The Penn Wharton Budget Model microsimulation (PWBMsim) is a model of the United States economy capable of projecting a rich array of demographic and economic variables. Its key goal is to capture the effects of government tax and spending policies under projections of the nation's evolving demographic and economic landscape. The PWBMsim is able to capture and analyze both the micro-distributional and macroeconomic effects of alternative fiscal policies. This document describes the goals of the PWBMsim and provides an overview of how it is constructed.


Many public and private institutions are engaged in public policy research. However, we perceive many shortcomings in the models and approaches that are used to generate estimates of national economic performance and government budget projections. As a result, policymakers do not receive the most accurate and reliable advice on the effects of alternative public policies at the micro- and macro-economic levels.

PWBMsim’s objective is to create a computational tool that incorporates all of the necessary details about the nation's economic and demographic profile and to integrate those data in an internally consistent manner. At the same time, the tool should be sufficiently easy to use by policymakers and the interested public to explore the implications of a wide array of economic policy issues. The key medium-term goal of PWBMsim is to build a model with sufficient detail and flexibility to estimate the economic effects of alternative federal budget policies.


PWBMsim uses information from U.S. micro-data surveys to construct (in computer storage) a representation of the U.S. population and economy using simulation methods. At the micro level, PWBMsim constructs a population of individuals for the initial year (1996), organized into families of various types. PWBMsim uses micro-data from many sources (described below). One key data source is the publicly available annual Current Population Survey (CPS), whose samples are representative of the resident (non-institutionalized) U.S. population in various years since 1964. PWBMsim uses CPS micro-data surveys beginning in 1996.

Individuals in PWBMsim are assigned attributes such as race, gender, immigration status, family type, education, etc. When each individual is created, the attributes are assigned in sequence, and each new attribute assignment is conditioned on a suitable collection of previously assigned attributes. The assignments are done according to corresponding conditional frequency distributions of those attributes as observed in micro-data sources. Applying this procedure to each attribute (as described in greater detail in Appendix 1) yields PWBMsim's 1996 U.S. population. Such a simulated population of individuals matches the U.S. population quite closely as demonstrated later in this document.

PWBMsim incorporates most of the key attributes at the family and individual level and aggregates them to generate estimates of macroeconomic and demographic indicators: levels and growth rates of the U.S. population, labor force, employment, population dependency ratios, GDP, worker compensation, the economy’s capital stock, taxable income, federal revenues, federal spending, and other variables of interest. To do so, PWBMsim draws micro-data from many national surveys beyond the CPS, such as the Panel Study of Income Dynamics (PSID), the Survey of Consumer Finances (SCF), the Survey of Income and Program Participation (SIPP), the Consumer Expenditure Survey (CEX), and others.

A detailed tool for analyzing federal budget policies must comprehensively include federal government tax and spending programs. When fully developed, PWBMsim will provide policymakers and the public with a tool to explore systematically the effects of different fiscal policies on the federal budget and on the performance of the U.S. economy at both micro- and macro-economic levels.

Individuals and Families

On the micro-economic side, PWBMsim models the evolution of many demographic variables for families and individuals. The microsimulation model begins with a population of simulated individuals as of 1996, who are drawn to resemble the characteristics of the U.S. population in that year. The family head is created first, and the immigration-status attribute (whether foreign-born) is the first attribute assigned. This assignment is based on the frequency of foreign-born individuals in CPS1996. Next, the family-type attribute is assigned to the family head: whether, in 1996, the head is a single individual, a single head with children, or a married individual with or without children. Again, the random assignment of family type is controlled by the frequency of alternative family types in CPS1996. Depending on the family type assigned, other family members are created in a similar way. All individuals in the family are assigned other attributes such as ethnicity, gender, education level, disability status, employment status, work weeks in the year, and so on (see Appendix 1).
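The sequential, conditional assignment described above can be illustrated with a minimal sketch. It is not PWBMsim's implementation: the frequency tables, the attribute names, and the choice to condition education on family type are hypothetical, and the numbers are placeholders rather than CPS1996 estimates.

```python
import random

# Hypothetical frequency tables (placeholder numbers, not CPS1996 estimates).
P_FOREIGN_BORN = {True: 0.10, False: 0.90}
P_FAMILY_TYPE = {  # conditioned on foreign-born status
    True: {"single": 0.30, "single_with_children": 0.10, "married": 0.60},
    False: {"single": 0.35, "single_with_children": 0.15, "married": 0.50},
}
P_EDUCATION = {  # conditioned on family type (an illustrative choice)
    "single": {"high_school": 0.55, "college": 0.45},
    "single_with_children": {"high_school": 0.70, "college": 0.30},
    "married": {"high_school": 0.50, "college": 0.50},
}


def draw(dist, rng):
    """Draw one value from a {value: probability} table."""
    values = list(dist)
    weights = [dist[v] for v in values]
    return rng.choices(values, weights=weights, k=1)[0]


def create_family_head(rng):
    """Assign attributes in sequence, each conditioned on earlier ones."""
    person = {}
    person["foreign_born"] = draw(P_FOREIGN_BORN, rng)
    person["family_type"] = draw(P_FAMILY_TYPE[person["foreign_born"]], rng)
    person["education"] = draw(P_EDUCATION[person["family_type"]], rng)
    return person


rng = random.Random(0)
population = [create_family_head(rng) for _ in range(10_000)]
share_foreign_born = sum(p["foreign_born"] for p in population) / len(population)
print(f"simulated foreign-born share: {share_foreign_born:.3f}")
```

With a large enough simulated population, the simulated shares approach the input frequencies, which is the sense in which the simulated population is drawn to resemble the survey population.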

Once all 1996 population attributes are assigned, the simulated population is transitioned through subsequent years. The annual transitions involve deciding whether, and how, each attribute of each individual changes in the following year. Each person becomes one year older (aging), some women give birth (fertility), some individuals die (mortality), singles marry, married individuals divorce, children and adults acquire education, and children move out of their parents’ families as they enter adulthood. Adult and near-adult individuals work, some become disabled, and workers earn wage and non-wage compensation, consume, and save. New individuals immigrate, non-natives change immigration status, some of them emigrate, and so on.

All year-over-year transitions of individual-level demographic and economic attributes are governed by micro-data information on transition rates across alternative attribute states. For example, PWBMsim calibrates individuals’ educational attainment by estimating rates of transition across single years of education, from one (at age 6) through 18, continuing into adulthood. Such Markov transition probabilities govern the evolution of variables such as marriage, divorce, labor-force entry and exit, immigration and emigration, entry into and exit from disability, and so on, where each transition is calibrated according to conditional transition rates calculated from micro-data surveys. As another example, the labor-force status of adult males and females is distinguished among four states (not working, wage-only worker, self-employed worker, and both wage and self-employment), and transitions among these states are distinguished by the person’s immigrant status, family type (single- or dual-headed family), gender, ethnicity, age group, and education group. Such a fine-grained decomposition of attributes into states by population subgroup accommodates a large variety of potential interactions in the future evolution of attributes across states for different population subgroups.
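As an illustration of a subgroup-conditioned Markov transition, the following sketch steps one individual's labor-force status through the four states named above. The subgroup keys and all probabilities are hypothetical placeholders, and only two of the conditioning variables are used to keep the example short.

```python
import random

STATES = ["not_working", "wage_only", "self_employed", "both"]

# Hypothetical annual transition probabilities keyed by subgroup, then by
# current state. Each row lists P(next state) in the order of STATES.
TRANSITIONS = {
    ("female", "college"): {
        "not_working":   [0.75, 0.20, 0.03, 0.02],
        "wage_only":     [0.06, 0.88, 0.03, 0.03],
        "self_employed": [0.05, 0.10, 0.80, 0.05],
        "both":          [0.04, 0.26, 0.20, 0.50],
    },
    ("male", "high_school"): {
        "not_working":   [0.70, 0.25, 0.03, 0.02],
        "wage_only":     [0.08, 0.86, 0.03, 0.03],
        "self_employed": [0.06, 0.12, 0.77, 0.05],
        "both":          [0.05, 0.25, 0.20, 0.50],
    },
}


def next_state(state, subgroup, rng):
    """Draw next year's state from the row for this subgroup and state."""
    row = TRANSITIONS[subgroup][state]
    return rng.choices(STATES, weights=row, k=1)[0]


def simulate_path(initial_state, subgroup, n_years, rng):
    """Return the sequence of states over n_years annual transitions."""
    path = [initial_state]
    for _ in range(n_years):
        path.append(next_state(path[-1], subgroup, rng))
    return path


rng = random.Random(1)
path = simulate_path("not_working", ("female", "college"), 20, rng)
print(path)
```

Keying the table by subgroup is what lets transition rates differ across population groups; adding a conditioning variable amounts to extending the key.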

The conditioning variables for each transition rate structure (which distinguish transitions for different population subgroups by age, ethnicity, gender, etc.) are selected based on information about the “variables that matter.” For example, if the transition rates into and out of disability are no different for males and females, gender is not used as a conditioning variable for calculating transition probabilities into and out of disability.

A marriage component explicitly captures the evolution of racial and income composition of families. Here, in a search and matching component of the model, unmarried singles are randomly paired with prospective partners, with whom they may form matches. The match probabilities are calibrated to match the observed joint demographic composition of male and female married couples. The microsimulation model therefore captures important dynamics introduced into the demographic makeup of the United States through assortative pairing of singles by income and race.

Ethnicity is a key factor for calibrating “meeting” rates of potential married couples, and educational attainment is included in conditioning marriage acceptance rates. See the section on marriage calibrations below for a more detailed discussion of the marriage “meeting and acceptance” model. See also Appendix 2 for a detailed definition of transition probabilities by attribute and the corresponding conditioning variables, and Appendix 3 for a description of PWBMsim’s marriage-divorce modules.
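The “meeting and acceptance” idea can be sketched as a two-stage random pairing. This is only an illustration under stated assumptions: the functions meeting_prob and acceptance_prob, the two-group ethnicity and education codes, and every probability are hypothetical, and the actual calibration is the one described in the marriage section and Appendix 3.

```python
import random


def meeting_prob(a, b):
    """Hypothetical meeting rate, conditioned on ethnicity."""
    return 0.8 if a["ethnicity"] == b["ethnicity"] else 0.2


def acceptance_prob(a, b):
    """Hypothetical acceptance rate, conditioned on educational attainment."""
    return 0.6 if a["education"] == b["education"] else 0.3


def match_singles(men, women, rng):
    """Pair each man with one random prospective partner; keep accepted matches."""
    available = list(women)
    couples = []
    for man in men:
        if not available:
            break
        idx = rng.randrange(len(available))
        candidate = available[idx]
        meets = rng.random() < meeting_prob(man, candidate)
        accepts = rng.random() < acceptance_prob(man, candidate)
        if meets and accepts:
            couples.append((man, available.pop(idx)))
    return couples


def make_singles(n, rng):
    """Create n placeholder singles with random ethnicity and education codes."""
    return [
        {
            "id": i,
            "ethnicity": rng.choice(["A", "B"]),
            "education": rng.choice(["high_school", "college"]),
        }
        for i in range(n)
    ]


rng = random.Random(2)
men = make_singles(1000, rng)
women = make_singles(1000, rng)
couples = match_singles(men, women, rng)
same_ethnicity_share = sum(
    m["ethnicity"] == w["ethnicity"] for m, w in couples
) / len(couples)
print(len(couples), round(same_ethnicity_share, 2))
```

Because same-ethnicity pairs meet more often in this sketch, the resulting couples are assortatively matched even though prospective partners are drawn at random.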

Markov transition rates governing annual changes in attribute states are estimated separately for different historical phases within the time span 1996-2016. The transition outcomes of PWBMsim are validated by comparing the distributions of attributes across states generated by PWBMsim with the distributions estimated from CPS micro-data. For all demographic and economic attributes, trends in Markov transition probabilities are projected forward into the future, allowing the microsimulation model to capture interactions between attributes as they evolve through the years. This process generates projections of all demographic attributes of the population at the micro level, thus characterizing the likely future path of the nation's demographic profile. Of course, because the simulation involves random assignment governed by conditional distribution functions and Markov transition probabilities, each forward simulation incorporates "simulation variation" in outcomes. Running many such forward simulations yields an estimate of the potential range of projected outcomes for each micro-level attribute and each aggregated (macroeconomic) variable arising from the trends and attribute interactions built into PWBMsim.
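The role of simulation variation can be seen in a toy example: the same two-state employment process is run forward many times with different random seeds, and the spread of the aggregate outcome is summarized. The function simulate_once, its parameters, and all rates are hypothetical; the sketch shows only the mechanics of repeated forward simulation.

```python
import random
import statistics


def simulate_once(seed, n_people=500, n_years=10, p_enter=0.10, p_exit=0.05):
    """Run one forward simulation; return the final employment rate."""
    rng = random.Random(seed)
    employed = [rng.random() < 0.6 for _ in range(n_people)]
    for _ in range(n_years):
        employed = [
            (rng.random() >= p_exit) if is_employed else (rng.random() < p_enter)
            for is_employed in employed
        ]
    return sum(employed) / n_people


# Repeat the forward simulation with different seeds and sort the outcomes.
outcomes = sorted(simulate_once(seed) for seed in range(200))
low = outcomes[int(0.05 * len(outcomes))]        # roughly the 5th percentile
high = outcomes[int(0.95 * len(outcomes)) - 1]   # roughly the 95th percentile
median = statistics.median(outcomes)
print(f"employment rate: median {median:.3f}, range {low:.3f} to {high:.3f}")
```

The interval between the low and high values is one simple way to report the potential range of a projected aggregate across simulations.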

PWBMsim includes a labor market component, which governs the evolution of employment, unemployment, wages, and self-employment earnings. Individuals transition to and from employment, conditional on their employment history, their family status, race, and other associated demographic characteristics. For those who are employed in a given year, each worker’s labor contribution (or a “core labor input” index to be applied to labor hours) is estimated using coefficients from a regression model of observed wage earnings (from appropriately harmonized CPS micro-data on wages during 1996-2016) on all of the workers’ demographic characteristics. Thus, each PWBMsim worker’s labor input is influenced by the entire constellation of demographic and economic characteristics, and not just by a few selected ones, such as age and education, as is standard in estimating labor input indexes. (Appendix 4 provides details on the earnings regression model.)
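To make the idea of a regression-based labor input index concrete, the sketch below applies a set of coefficients to worker characteristics and expresses each worker's predicted earnings relative to a reference worker. The log-wage functional form, the coefficient values, and the variable names are all assumptions made for illustration; the actual specification is the one in Appendix 4.

```python
import math

# Hypothetical coefficients of a log-wage regression (placeholder values).
COEF = {
    "intercept": 2.0,
    "years_education": 0.08,
    "age": 0.05,
    "age_squared": -0.0005,
    "foreign_born": -0.05,
}


def predicted_log_wage(worker):
    """Linear prediction from the hypothetical regression coefficients."""
    return (
        COEF["intercept"]
        + COEF["years_education"] * worker["years_education"]
        + COEF["age"] * worker["age"]
        + COEF["age_squared"] * worker["age"] ** 2
        + COEF["foreign_born"] * (1.0 if worker["foreign_born"] else 0.0)
    )


def labor_input_index(worker, reference):
    """Predicted wage of the worker relative to the reference worker."""
    return math.exp(predicted_log_wage(worker) - predicted_log_wage(reference))


reference = {"years_education": 12, "age": 40, "foreign_born": False}
graduate = {"years_education": 16, "age": 40, "foreign_born": False}
index = labor_input_index(graduate, reference)
print(f"labor input index relative to reference: {index:.3f}")
```

An index of this kind would then be applied to hours worked, so that aggregate labor input reflects the composition of the workforce as well as its size.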

Estimation of Macroeconomic Variables

On the macroeconomic side, the PWBMsim projects macroeconomic aggregates such as GDP, employment, work-hours, the capital stock, the capital-labor ratio, and labor productivity. It also constructs projections of federal budget aggregates such as total revenues, expenditures, the federal deficit, and government debt.

Several models have been constructed by government agencies and think tanks in order to study the budgetary and distributional implications of government policies. The microsimulation model distinguishes itself by explicitly modeling a much richer set of economic and demographic variables, as well as the interactions between these variables. A key feature of PWBMsim is the integration of all micro-level attributes within a growth model. This enables the model to capture and project the implications of changes over time in the composition of micro-demographic and economic attributes for overall labor productivity and economic growth. This differs sharply from the procedure adopted by other “forecasting” efforts, which impose assumed rates of labor-force participation, employment, and productivity growth estimated from historical information. A notable feature of PWBMsim is the consistency of its micro-demographic, macroeconomic, and federal budget projections.

At the macro level, a neoclassical production function is used to aggregate individual labor inputs into macroeconomic variables. The model takes labor earnings, capital, and multifactor productivity as inputs into gross domestic product. The growth rate of multifactor productivity is calibrated to match data from the Bureau of Labor Statistics multifactor productivity accounts. The labor share of output is aggregated from individual earnings in the economy. The capital share is projected forward based on historical data, with the resulting level of capital earnings set to match labor earnings. Together, these three factors enter into a production function to produce projected gross domestic product.
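The aggregation step can be sketched with a Cobb-Douglas production function. This functional form is an assumption chosen for illustration (the text above says only that the function is neoclassical), and the capital share, productivity growth rate, and input paths below are placeholder values.

```python
def gdp(mfp, capital, labor, capital_share=0.35):
    """Cobb-Douglas output: Y = A * K**alpha * L**(1 - alpha) (assumed form)."""
    return mfp * capital ** capital_share * labor ** (1.0 - capital_share)


def project_gdp(mfp0, mfp_growth, capital_path, labor_path, capital_share=0.35):
    """Project output year by year with multifactor productivity growing at a constant rate."""
    return [
        gdp(mfp0 * (1.0 + mfp_growth) ** t, k, l, capital_share)
        for t, (k, l) in enumerate(zip(capital_path, labor_path))
    ]


# Placeholder inputs: constant capital and labor, 1 percent productivity growth.
capital_path = [200.0] * 5
labor_path = [100.0] * 5
gdp_path = project_gdp(1.0, 0.01, capital_path, labor_path)
print([round(y, 2) for y in gdp_path])
```

In a full model the labor path would come from aggregating the individual labor input indexes and hours produced by the microsimulation, rather than being supplied as a constant.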

The remainder of this document describes the technical details of the microsimulation model. It begins with an overview section that describes the model in general terms. The following sections describe the individual components of the model in more detail.