## Appendix 3: Marriage and Divorce Calibrations

This section describes how marriage and divorce processes are calibrated and implemented in `PWBMsim`.

## Transitions from Singlehood into Married Status

Single females and males aged 18 through 84 participate in the marriage market. Each eligible single female (the primary individual) meets 12 eligible single males (secondary individuals) per year in the following way. Six are randomly selected from males of the same race as the female, and six are randomly selected from the entire population of eligible males. Meetings are random along the other dimensions (`education` and `age` of the male). At each meeting, she can either accept the male and form a marriage or reject him and move on to the next meeting. Once she accepts a male, she stops receiving further meetings and exits the pool of singles. If she rejects all 12 meetings, she stays single in the current year. The probability of acceptance is a function of {female race, female education group, female age group, male race, male education group, and male-female age gap group}. Marriages on the male side are determined as the outcomes of the meeting and acceptance process for all eligible females.
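The meeting-and-acceptance loop described above can be sketched in Python as follows. This is a minimal illustration, not the `PWBMsim` implementation: the record layout (dicts with `id` and `race`), the function names, and the `accept_prob` callback are all hypothetical, and the sketch assumes that an accepted male leaves the pool immediately and that the same male may occasionally be drawn in both the same-race and the random half of the meetings.

```python
import random

N_MEETINGS = 12   # meetings per single female per year (from the text)
N_SAME_RACE = 6   # meetings drawn from her own race (from the text)

def draw_meetings(female, single_males, rng):
    """Draw 6 males of her own race and 6 from all eligible males (fewer if the pool is small)."""
    same_race = [m for m in single_males if m["race"] == female["race"]]
    picks = rng.sample(same_race, min(N_SAME_RACE, len(same_race)))
    n_random = min(N_MEETINGS - N_SAME_RACE, len(single_males))
    picks += rng.sample(single_males, n_random)
    rng.shuffle(picks)  # assumption: the order of meetings is random
    return picks

def simulate_year(single_females, single_males, accept_prob, rng):
    """Return the (female, male) marriages formed in one simulated year."""
    available = list(single_males)
    marriages = []
    for female in single_females:
        for male in draw_meetings(female, available, rng):
            if rng.random() < accept_prob(female, male):
                marriages.append((female, male))
                available.remove(male)  # assumption: he exits the pool of singles
                break                   # she stops receiving further meetings
        # if every meeting was rejected, she simply stays single this year
    return marriages
```

Here `accept_prob(female, male)` stands in for the calibrated acceptance probability, which in the text depends on race, education, age group, and the age gap.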

## Transition from Married to Single Status

Marriages are also subject to the risk of divorce. The probability of divorce is a function of {female race, female education group, and female age group}. Marriages may also end because of the death of a spouse, in which case the surviving spouse enters the single pool in the following year.

The “meeting step” of the `PWBM` method introduces a pairing of primary and secondary individuals that can be calibrated to the size or availability of potential partners of “own type” versus “other type” in the marriage pool. In its application, a male and a female from the same racial/ethnic group may have greater chances of meeting each other than a male and a female from different racial groups. For some racial types, this may happen simply because of their large shares of the population (whites in the United States). However, even for smaller population groups, own-race meetings may dominate, possibly because of preference, greater knowledge about own types, and residential segregation, making pairings and marriage more likely among individuals with similar characteristics.

Next, whether the current pairing is transitioned into a marriage (the “acceptance step") is calibrated according to the particular characteristics of both individuals in that pairing. These characteristics include age, race and education. There are many advantages of this modified-two-step approach:

• Given residential and school segregation across racial groups, it is reasonable to assume that the observed high fraction of same-race marriages is potentially caused by frequent meetings within a race. (In 2000, the median black person lived in a neighborhood that was 52 percent black (Easterly 2009). In 2001, 80 percent of Latino students and 74 percent of black students attended schools where whites were not the majority (Orfield, Kuscera, and Siegel-Hawley 2012). Shin (2014) estimates meeting opportunities and marital preferences in a behavioral model of the U.S. marriage market and shows that individuals have greater chances of meeting within their own race than under random meetings, which explains most of the racial homogamy patterns.)

• Since meetings occur among all singles instead of a smaller set of singles who are designated for marriage, the pool of potential spouses is not greatly affected by the order of simulation. This method does not suffer from the residual match problem.

• Because this modified two-step approach limits its analysis to “meetings” instead of all potential matches, it does not rely on unrealistic assumptions about complete information.

• It is computationally more efficient than the standard stochastic two-step approach, which requires computing payoffs for all potential matches. Even though `PWBM`’s modified two-step approach starts with a bigger pool of eligible individuals (all singles), only a limited number of meetings is considered for each person, reducing the computational burden. (For example, if $n$ females form marriages in a given year, the standard two-step approach evaluates compatibility measures for $n^2$ potential matches. Given a marriage formation rate of 0.05 (among all singles, around 5 percent form a marriage in a given year), the total number of single females would be $\frac{n}{0.05}=20n$. If each single female receives 12 meetings (the calibration of the number of meetings is introduced in the following section), $240n$ potential matches are considered. This is no greater than $n^2$ when the sample is sufficiently large (the number of marriages each year satisfies $n\ge 240$).)

• As is true in the real world, the modified two-step method may result in some singles remaining unmarried in any given year. This happens when all meetings are followed by non-acceptance based on the quality (probability) thresholds of those meetings. Once a particular pairing is accepted and the couple is married, further meetings are not needed, which also reduces the computational burden.

• Finally, the proposed modified two-step approach builds on the literature on two-sided random search and matching models pioneered by Mortensen (1988), Pissarides (2000), Shimer and Smith (2000), and Burdett and Coles (1999), among many others, that investigates matching processes in labor and marriage markets. Equilibrium conditions derived from a search and matching model are used to estimate the parameters of the `PWBM` marriage market.

• `PWBMsim`’s marriage-divorce algorithm successfully replicates historically observed marriage patterns, in particular marriage prevalence rates and spousal type distributions of different racial and education groups.
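The computational comparison in the list above (evaluating $n^2$ potential matches versus $240n$ meetings) can be checked with a few lines of Python. The function name and its default arguments are illustrative; the defaults simply restate the 0.05 formation rate and the 12 meetings used in the example.

```python
def pair_evaluations(n_marriages, formation_rate=0.05, meetings_per_female=12):
    """Return (standard two-step evaluations, modified two-step evaluations)."""
    single_females = n_marriages / formation_rate   # n / 0.05 = 20n
    standard = n_marriages ** 2                     # all potential matches
    modified = meetings_per_female * single_females # 12 * 20n = 240n
    return standard, modified

for n in (100, 240, 1000):
    standard, modified = pair_evaluations(n)
    print(n, standard, round(modified))
```

The two counts coincide at $n=240$; for larger $n$ the meeting-based approach evaluates fewer pairs.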

## The `PWBMsim` Marriage-Divorce Model

### Marriage Formation rates

Technically, marriage formation rates (the rates at which females of a given type marry males of different types) should be estimated from observed patterns of new marriages. However, existing micro-datasets of new marriages are insufficient to account for all of the attributes (natural ones such as age and race, and acquired ones such as schooling) of potential marriage partners. The following section presents a method for inferring marriage formation rates from the stock of married couples (all marriages as opposed to new marriages). This method uses a steady-state condition on marriage stocks in a population-normalized marriage market equilibrium. By applying this technique, one can estimate formation rates of marriages among males and females of different races and education levels.

Consider a population of males and females: $i$ denotes the types of males and $j$ denotes the types of females, where $i\in\{1,2,\ldots,I\}$ and $j\in\{1,2,\ldots,J\}$. `Race` (0: white, 1: black, 2: Hispanic, 3: Asian, 4: other) and `education` (0: less than high school, 1: high school graduates and some college, 2: college graduates) determine types, so there are 15 types in total $(I=J=15)$. Individuals are either single or married. Let $S_i^{m,t}$ denote the number of single males of type $i$, and $S_j^{f,t}$ the number of single females of type $j$ in year $t$. The total number of single males (females) is denoted by $S^{m,t}$ ($S^{f,t}$). $\mu_{ij}^t$ measures the stock of marriages between males of type $i$ and females of type $j$ in year $t$. All individuals face a constant probability of death, $\delta$, every year.

Let $f_{ij}^t$ denote the rate at which single females of type $j$ form marriages with males of type $i$. This means that there are $f_{ij}^t S_j^{f,t}$ new marriages in year $t$. However, some of the existing marriages between $i$ and $j$ ($\mu_{ij}^t$) dissolve, through divorce (rate: $d_{ij}^t$) or the death of one or both spouses (rate: $\delta+\delta-\delta^2$), at the total rate $\widetilde{\delta}_{ij}^t$ ($\widetilde{\delta}_{ij}^t\equiv d_{ij}^t+\delta+\delta-\delta^2$). Thus the change in the stock of $i,j$ marriages is the following.

$$ \underbrace{\mu_{ij}^{t+1}-\mu_{ij}^t}_{\text{Change in } i,j \text{ marriages}} = \underbrace{f_{ij}^t S_j^{f,t}}_{\text{Inflow to } i,j \text{ marriages}}-\underbrace{\widetilde{\delta}_{ij}^t \mu_{ij}^t}_{\text{Outflow from } i,j \text{ marriages}} $$

Based on the equation above, given the measure of marriages ($\mu_{ij}^t$ and $\mu_{ij}^{t+1}$), the measure of single females ($S_j^{f,t}$), and the marriage dissolution rates ($\widetilde{\delta}_{ij}^t$), one can estimate the formation rate of $i,j$ marriages, $f_{ij}^t$, where

$$ f_{ij}^t=\widetilde{\delta}_{ij}^t\frac{\mu_{ij}^t}{S_j^{f,t}}+\frac{\mu_{ij}^{t+1}}{S_j^{f,t}}-\frac{\mu_{ij}^t}{S_j^{f,t}} $$

This equation shows that the formation rate ($f_{ij}$) is higher in the following cases: 1) There are more $j$ females who are currently married to $i$ males, where $\mu_{ij}$ is normalized by the number of $j$ females who are single (high $\frac{\mu_{ij}^t}{S_j^{f,t}}$). 2) Marriage $ij$ is unstable (high $\widetilde{\delta}_{ij}^t$). 3) There is an increase in the measure of $ij$ marriages (high $\frac{\mu_{ij}^{t+1}}{S_j^{f,t}}-\frac{\mu_{ij}^t}{S_j^{f,t}}$). Instead of estimating the formation rate for each year, we estimate a single formation rate that governs marriage patterns from 1995 to 2013. The equation changes accordingly. (Normalization (division by $S_j^f$) is done year by year, departing from the equation above. The years from 1995 to 2013 are the historical simulation period for `PWBM`.)
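As a numerical illustration of the single-year stock-flow relation, the following Python sketch recovers a formation rate from two consecutive marriage stocks. The function names and the input numbers in the usage note are hypothetical; only the formulas come from the text.

```python
def dissolution_rate(divorce_rate, death_rate):
    """Total dissolution rate: divorce plus death of one or both spouses."""
    return divorce_rate + 2 * death_rate - death_rate ** 2

def formation_rate(mu_t, mu_next, single_females, divorce_rate, death_rate):
    """Formation rate f_ij^t implied by marriage stocks in years t and t+1."""
    delta_tilde = dissolution_rate(divorce_rate, death_rate)
    replacement = delta_tilde * mu_t   # new marriages needed to offset dissolutions
    growth = mu_next - mu_t            # net change in the stock
    return (replacement + growth) / single_females
```

For instance, with an unchanged stock of 1,000 marriages, 500 single females, a divorce rate of 0.02, and a death rate of 0.01, `formation_rate(1000, 1000, 500, 0.02, 0.01)` gives about 0.0798, which is exactly the rate needed to replace dissolved marriages.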

$$ f_{ij} \approx \widetilde{\delta}_{ij} \sum_{t=1995}^{2013}\frac{\mu_{ij}^t}{S_j^{f,t}} /19+\sum_{t=2005}^{2013}\frac{\mu_{ij}^t}{S_j^{f,t}}/9-\sum_{t=1995}^{2004}\frac{\mu_{ij}^t}{S_j^{f,t}}/10 $$

If there is no change in stocks (the last two terms), the formation rate is determined so as to offset marriages dissolved during the period, which we call the steady-state formation rate $(f_{ij}^{SS}=\widetilde{\delta}_{ij}\sum_{t=1995}^{2013}( \frac{\mu_{ij}^t}{S_j^{f,t}})/19)$. For the final estimates, $f_{ij}$, the baseline level $f_{ij}^{SS}$ is adjusted to take into account changes in stocks. The changes only compare the first period (1995-2004) and the second period (2005-2013). However, the additive form sometimes yields undesirable outcomes where $f_{ij}\lt 0$, which occurs when the change term dominates the baseline term. Thus, we adjust the baseline term proportionally, considering the rate of change in $\frac{\mu_{ij}^t}{S_j^{f,t}}$. (We can think of three values of $f_t$, $\{f_1,f_2,f_3\}$. If the rate of change of $f$ is $r$ and constant over time, $f_3=(1+r)f_2=(1+r)^2 f_1$. Thus, $(\frac{f_3}{f_1})^{\frac12}=(1+r)$ and $f_3=(\frac{f_3}{f_1} )^{\frac12}f_2$, where we approximate $f_1=\sum_{t=1995}^{2004}\frac{\mu_{ij}^t}{S_j^{f,t}}/10$, $f_2=\sum_{t=1995}^{2013}\frac{\mu_{ij}^t}{S_j^{f,t}}/19$, and $f_3=\sum_{t=2005}^{2013}\frac{\mu_{ij}^t}{S_j^{f,t}}/9$, respectively.)

\begin{align} f_{ij}^{\text{final}} &=\widetilde{\delta}_{ij} \left(\sum_{t=1995}^{2013}\frac{\mu_{ij}^t}{S_j^{f,t}}/19\right)\left(1+\frac{\sum_{t=2005}^{2013}\frac{\mu_{ij}^t}{S_j^{f,t}}/9-\sum_{t=1995}^{2004}\frac{\mu_{ij}^t}{S_j^{f,t}}/10}{\sum_{t=1995}^{2004}\frac{\mu_{ij}^t}{S_j^{f,t}}/10}\right)^{\frac12} \\ &=\widetilde{\delta}_{ij}\left(\sum_{t=1995}^{2013}\frac{\mu_{ij}^t}{S_j^{f,t}}/19\right) \left(\frac{\sum_{t=2005}^{2013}\frac{\mu_{ij}^t}{S_j^{f,t}}/9}{\sum_{t=1995}^{2004}\frac{\mu_{ij}^t}{S_j^{f,t}}/10}\right)^{\frac12} \end{align}
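The contrast between the additive form and the proportional form can be seen in a small Python sketch. The function name and the dictionary layout (year mapped to $\mu_{ij}^t/S_j^{f,t}$) are assumptions made for illustration; the period averages follow the definitions of $f_1$, $f_2$, and $f_3$ in the text.

```python
def pooled_formation_rates(ratio_by_year, delta_tilde):
    """Return (additive estimate, proportional 'final' estimate) for one i,j pair.

    ratio_by_year maps each year 1995..2013 to mu_ij^t / S_j^{f,t}.
    """
    early = [ratio_by_year[t] for t in range(1995, 2005)]  # 1995-2004, 10 years
    late = [ratio_by_year[t] for t in range(2005, 2014)]   # 2005-2013, 9 years
    f1 = sum(early) / len(early)
    f3 = sum(late) / len(late)
    f2 = sum(early + late) / (len(early) + len(late))      # all 19 years
    additive = delta_tilde * f2 + f3 - f1                  # can turn negative
    final = delta_tilde * f2 * (f3 / f1) ** 0.5            # always positive
    return additive, final
```

When the normalized stock falls sharply between the two periods, the additive estimate becomes negative while the proportional version stays positive, which is the motivation given for the adjustment.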

### Acceptance Probabilities

As described above, marriage formation occurs in two steps: first, singles meet each other, and second, some of those meetings lead to marriage while others do not.

Assume that each single female meets $N^f$ single males each year. A fraction $r^{\text{same}}$ of these meetings are with single males of the same race. The remaining $(1-r^{\text{same}})N^f$ meetings are random among all single males. Thus, the number of meetings between males of race $r$ and females of race $r'$, $M(r,r')$, is given by the following. ($N^f$ is calibrated to 12 in the simulation and $r^{\text{same}}$ to 0.5, so every single female meets 6 males of the same race and 6 males drawn at random every year. Although acceptance rates are adjusted to be consistent with whatever values of $N^f$ and $r^{\text{same}}$ are chosen, several reasons justify the current levels: 1) A smaller $N^f$, which implies higher acceptance rates per meeting, leads to unstable patterns in the subsequent parameter adjustment introduced in a later section. 2) With a smaller $r^{\text{same}}$, marital patterns across races become more sensitive to changes in the racial distribution. 3) Under the current parameters, all racial groups (except for the other group) still accept a same-race spouse more readily than a spouse of a different race.)

\begin{align} M(r,r) &= (1-r^{\text{same}})N^f S^f \frac{S_r^m}{S^m}\frac{S_r^f}{S^f} +r^{\text{same}} N^f S_r^f \quad \text{(same race)} \\ M(r,r') &= (1-r^{\text{same}})N^f S^f \frac{S_r^m}{S^m}\frac{S_{r'}^f}{S^f} \quad \text{(different race)} \end{align}

The total number of meetings in a year is then given by

$$ M=\sum_r\sum_{r'}M(r,r') =(1-r^{\text{same}})N^f S^f+r^{\text{same}} N^f S^f=N^f S^f $$
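The meeting counts and their total can be verified numerically. The sketch below is illustrative only: the function name, the dictionary inputs, and the population numbers in the usage note are hypothetical, while the formula for $M(r,r')$ and the calibrated values $N^f=12$ and $r^{\text{same}}=0.5$ come from the text.

```python
def meeting_counts(single_males, single_females, n_meetings=12, r_same=0.5):
    """Return M[(r, r_prime)], the meetings between race-r males and race-r' females.

    single_males and single_females map race -> number of singles.
    """
    total_males = sum(single_males.values())
    counts = {}
    for r, s_m in single_males.items():
        for r_prime, s_f in single_females.items():
            # random half: her meetings fall on race r in proportion to S_r^m / S^m
            random_part = (1 - r_same) * n_meetings * s_f * s_m / total_males
            # same-race half: only applies when male and female races coincide
            same_part = r_same * n_meetings * s_f if r == r_prime else 0.0
            counts[(r, r_prime)] = random_part + same_part
    return counts
```

With 600/300/100 single males and 500/400/100 single females across three races, the entries sum to $12 \times 1000 = 12{,}000$ meetings, matching $N^f S^f$.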

In terms of rates, the rate at which single females of race $r'$ meet single males of race $r$ is given by the following. (Meeting is based on the distribution of types among single males, which also varies across years. We therefore pool information (the distribution of race and education among single males in the most recent 9 years) to recover the meeting rate $\frac{M(r,r')}{S_{r'}^f }$. \begin{align} \frac{M(r,r)}{S_r^f} &= (1-r^{\text{same}})N^f \sum_{t=2005}^{2013}\left(\frac{S_r^{m,t}}{S^{m,t}}\right)/9 +r^{\text{same}} N^f \quad \text{(same race)} \\ \frac{M(r,r')}{S_{r'}^f} &= (1-r^{\text{same}})N^f \sum_{t=2005}^{2013}\left(\frac{S_r^{m,t}}{S^{m,t}}\right)/9 \quad \text{(different race)} \end{align} )

\begin{align} \frac{M(r,r)}{S_r^f} &= (1-r^{\text{same}})N^f \frac{S_r^m}{S^m} +r^{\text{same}}N^f \quad \text{(same race)} \\ \frac{M(r,r')}{S_{r'}^f} &= (1-r^{\text{same}})N^f \frac{S_r^m}{S^m} \quad \text{(different race)} \end{align}

In addition to racial traits, we also consider different levels of education. It is assumed that there is no segregation along education lines. The rate at which females of race $r'$ meet males of race $r$ and education $e$, $m^f (r,r',e,e')$, reflects the distribution of educational attainment among single males of each race. Note that this rate does not depend on female education, but only on female race.

$$ m^f (r,r',e,e')=\frac{M(r,r')}{S_{r'}^f }\frac{S_{r,e}^m}{S_r^m} $$

We have estimated the formation rate of marriages between males of ($r,e$) and females of ($r',e'$), $f(r,e,r',e')$. Given the specification of meeting described above, we can separate the formation rate into two parts: 1) the rate of meeting, $m^f (r,r',e,e')$; and 2) the probability of acceptance given meeting, $\rho(r,r',e,e')$. (The simulation assumes that the estimated acceptance rates are constant over time. This is different from behavioral models where utilities from particular matches are constant and acceptance rates are subject to change based on changes in the environment.)
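The separation of the formation rate into a meeting rate and an acceptance probability can be sketched as follows. This assumes the simple product form $f = m^f \times \rho$, which is a simplification (it ignores the fact that a female stops meeting once she accepts); the function names, argument layout, and example shares are hypothetical.

```python
def meeting_rate(r, r_prime, e, male_race_share, male_edu_share,
                 n_meetings=12, r_same=0.5):
    """Meetings per year between a race-r' single female and males of race r, education e.

    male_race_share[r]      : share of race r among single males (S_r^m / S^m)
    male_edu_share[(r, e)]  : share of education e among single males of race r
    """
    rate = (1 - r_same) * n_meetings * male_race_share[r]
    if r == r_prime:
        rate += r_same * n_meetings      # extra same-race meetings
    return rate * male_edu_share[(r, e)]

def acceptance_probability(formation, meeting):
    """Acceptance probability per meeting, assuming formation = meeting * acceptance."""
    return formation / meeting
```

For example, if white males make up 60 percent of single males and a quarter of them are in the lowest education group, a white female has $(6 \times 0.6 + 6) \times 0.25 = 2.4$ such meetings per year, so a formation rate of 0.048 would imply an acceptance probability of 0.02 per meeting.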

### Estimation

This section describes how the modeling framework is implemented using several sources of data. More variation in marriage and divorce patterns by race, education, and age is examined. This enables estimation of marriage formation parameters as functions of the potential partners’ ethnicity, education, and age. Divorce parameters are also estimated as functions of the “wife’s” race, education, and age.

### Data

The estimation of the marriage market is based on two sources of data. The first is the Panel Study of Income Dynamics (PSID), a longitudinal survey conducted annually between 1968 and 1997 and every two years thereafter. The latest PSID data file available is for the year 2015. It has information on marriage history, including the month and year of the formation and dissolution of each marriage of sample individuals (Marriage History 1985-2013). Individuals and their spouses are identified by person numbers. Their demographic and socio-economic information (age, race, and education) from the Family Files can then be matched across years. We restrict the estimation period to 1995 through 2013, which is `PWBM`’s historical simulation period. The second source is the Current Population Survey (CPS), a limited longitudinal data set. Information on the marital status, age, race, and education of individuals and their spouses is used.

### Divorce Rates

The divorce probability estimation is based on the PSID data. The dependent variable is whether or not a married female ($i$) experiences divorce in year $t$ ($Y_{it}=1$: divorce, $Y_{it}=0$: stay married). Independent variables ($X_{it}$) include dummies for `race` (0: white, 1: black, 2: Hispanic, 3: Asian, 4: other), `education` (0: less than high school, 1: high school graduates and some college, 2: college graduates) and `age groups` (0: age 18-24, 1: 25-34, 2: 35-54, 3: 55-84).

Estimation is based on the logit model. The probability of divorce is constructed as a function of independent variables and coefficients as follows. For those who are subject to the risk (married females), $d_{it}=1$ if divorce occurs in year $t$ and $d_{it}=0$ otherwise. The coefficients $\beta$ are estimated to maximize the sum of log-likelihoods weighted by individual sample weights ($w_i$).

\begin{align} \Pr(Y_{it}=1\mid X_{it}) &= \lambda_{it}=\frac{e^{X_{it}\beta}}{1+e^{X_{it}\beta}} \\ \Pr(Y_{it}=0\mid X_{it}) &= 1-\lambda_{it}=\frac{1}{1+e^{X_{it}\beta}} \end{align}

\begin{align} L_{it} &= \lambda_{it}^{d_{it}} (1-\lambda_{it} )^{1-d_{it}} \\ \ln L_{it} &= d_{it}\ln\lambda_{it}+(1-d_{it})\ln(1-\lambda_{it}) \\ \ln L &= \sum_iw_i \sum_{t}\ln L_{it}=\sum_iw_i \sum_t[ d_{it} \ln\lambda_{it}+(1-d_{it})\ln(1-\lambda_{it})] \end{align}
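The weighted log-likelihood above translates directly into code. The sketch below uses only the Python standard library and evaluates the objective for a given $\beta$; it does not perform the maximization, and the function names and the layout of `observations` are assumptions made for illustration.

```python
import math

def divorce_probability(x, beta):
    """Logit probability lambda = exp(x.beta) / (1 + exp(x.beta))."""
    z = sum(x_k * b_k for x_k, b_k in zip(x, beta))
    return 1.0 / (1.0 + math.exp(-z))

def weighted_log_likelihood(beta, observations):
    """Sum of w_i * [d ln(lambda) + (1 - d) ln(1 - lambda)] over person-years.

    observations: iterable of (weight, x, d) with d = 1 for divorce, 0 otherwise.
    """
    total = 0.0
    for weight, x, d in observations:
        lam = divorce_probability(x, beta)
        total += weight * (d * math.log(lam) + (1 - d) * math.log(1 - lam))
    return total
```

In practice `x` would hold a constant plus the race, education, and age-group dummies, and a numerical optimizer would search for the $\beta$ that maximizes this function.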

Table 1 reports the estimated coefficients ($\beta$’s) of the logit equation. Marriages of white females are more stable than black females’ marriages. Highly educated females also show less risk of divorce than females with low education. As females get older, the risk of divorce decreases. Based on these estimates, one can estimate the probability of divorce ($\lambda$) for females of different race, education, and age as in Table 2. (For the Asian and Other racial groups, the samples are not large enough to estimate separate effects for each education group. Thus, we take estimates from a logit regression in which race and education are controlled independently. The parameters have been further adjusted based on the iterated simulation results, as presented in the following section.)

### Formation and Acceptance Rates by Race, Education

For females of different race and education, the probability of accepting males of a given race and education can be calibrated based on the marriage market model. (To estimate formation rates of different marriages, we need the death rate and divorce rates by race and education without controls for age. A logit model controlling for race and education is estimated on the PSID data and used as an input for formation estimation. The death rate is calibrated at 0.01.) Rates of forming marriages for females of particular races and education levels with males of different races and education levels are first recovered as in Figure 1. Considering the different rates of meeting based on female race and the distribution of race and education among single males, the probabilities of accepting a meeting can be computed as a function of the race and education of males and females. Figure 3 presents estimated acceptance probabilities.

As shown by the spikes along the 45 degree line in the formation rate graphs, females generally have higher formation rates of marriages with males of the same race and education type. Hispanic, Asian, and other-race females also form marriages with white males. Females with lower education have lower marriage rates than females with higher education, especially among whites and blacks. The pattern of frequent same-type marriages carries over to the recovered acceptance probabilities. The spikes along the 45 degree line become larger for minority racial types (e.g., Asians) that have lower chances of meeting their own types. Females more readily accept males of the same education level than males with lower or higher levels of education than their own.

### Formation Rates by Age

#### Age of females

So far, we have estimated the probability that $j$ females (e.g., low educated white) accept $i$ males (e.g., high educated Hispanic) given a meeting. An additional dimension we want to consider is the age of females (0: age 18-24, 1: age 25-34, 2: age 35-54, 3: age 55-84). Given that some females may delay marriage to pursue education, age-dependent marital patterns may also interact with education. Thus, we first estimate marriage formation rates of females (age 18-84) with different education levels $(e')$ in 1995-2013, $f^\text{PSID} (e')$, based on the PSID data. We then estimate the rate of marriage formation for females of different education and age groups $(a')$, $f^\text{PSID} (a',e')$. The shift is then computed as the ratio between $f^\text{PSID} (a',e')$ and $f^\text{PSID} (e')$ and presented in Table 5.

$$ s(a';e')=\frac{f^{\text{PSID}} (a',e')}{f^{\text{PSID}} (e')} $$

Once we apply this shift to our estimates $p(r,r',e,e')$, we can estimate the acceptance probability for females of race $(r')$, education $(e')$, and age $(a')$ marrying males of race $(r)$ and education $(e)$, which is denoted as $p(r',e',a';r,e)$.

$$ p(r',e',a';r,e)=p(r,r',e,e')\times s(a';e') $$

#### Age Gaps between Males and Females

Finally, marriage formation also depends on the gap between male and female partners’ ages. We consider the 9 categories of age gaps (male age minus female age), where the groupings are (0: -21 or less, 1: -20/-9, 2:-8/-4, 3:-3/-1, 4:0, 5:1/3, 6:4/8, 7:9/20, 8:21 and over). The second column of Table 6 shows the distribution for age gaps in all new marriages between 1995 and 2013 in the PSID data.

Females face single males with different age gaps, depending on their own age. For example, if a female is age 18, the age gap cannot be negative. We consider 4 groups of females (0:18/24, 1:25/34, 2:35/54, 3:55/84) and compute weighted average probabilities of females meeting males with different age gaps. (Weights are based on the distribution of female age at marriage formation observed in the PSID data. For example, since females of age 35 marry more frequently than females of age 54, the age-gap distribution faced by 35-year-old females receives a higher weight than the one faced by 54-year-old females.) By taking the ratio between marriages and meetings, we obtain the shift terms in the last four columns of Table 6, which are then applied to acceptance rates based on the age gap. As expected, females readily accept males of the same age or males who are 1-3 years older. Older females have smaller chances of meeting single males of a similar age. Thus, conditional on meeting a male of a similar age, they accept him with higher probability.

Once the calibrated male-female age gap shifts, $s(a-a';a')$, are multiplied with the acceptance probabilities, the final acceptance probability becomes a function of the race, education, and age of both males and females. The acceptance probability, $p(r',e',a';r,e,a)$, for females of race ($r'$), education ($e'$), and age $(a')$ marrying males of race ($r$), education ($e$), and age ($a$) can be computed as follows. (For example, the probability that white females of low education between the ages of 18-24 accept a white male of low education of the same age is computed by $p(W,eduL,W,eduL) \times s(18-24;eduL) \times s(0;18-24) =0.0139 \times 241 \times .83$, which is equal to 0.049.)

$$ p(r',e',a';r,e,a)=p(r,r',e,e')\times s(a';e')\times s(a-a';a') $$
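The composition of the final acceptance probability can be sketched in Python, including the mapping of ages and age gaps to the groups defined in the text. The lookup-table layout (`s_age` keyed by female age group and education, `s_gap` keyed by gap group and female age group) and all function names are assumptions for illustration, and the numbers in the usage note are made up rather than taken from Tables 5 and 6.

```python
import bisect

AGE_GROUP_STARTS = [18, 25, 35, 55]                 # groups 0..3 cover ages 18-84
GAP_UPPER_BOUNDS = [-21, -9, -4, -1, 0, 3, 8, 20]   # upper bounds of gap groups 0..7

def age_group(age):
    """Map a female age (18-84) to groups 0:18-24, 1:25-34, 2:35-54, 3:55-84."""
    if not 18 <= age <= 84:
        raise ValueError("age outside the marriage market (18-84)")
    return bisect.bisect_right(AGE_GROUP_STARTS, age) - 1

def age_gap_group(male_age, female_age):
    """Map male age minus female age to the 9 gap groups (0: -21 or less ... 8: 21 and over)."""
    return bisect.bisect_left(GAP_UPPER_BOUNDS, male_age - female_age)

def final_acceptance(p_base, s_age, s_gap, female_edu, female_age, male_age):
    """p(r',e',a';r,e,a) = p(r,r',e,e') * s(a';e') * s(a-a';a')."""
    a = age_group(female_age)
    g = age_gap_group(male_age, female_age)
    return p_base * s_age[(a, female_edu)] * s_gap[(g, a)]
```

With a hypothetical base probability of 0.02, an age shift of 1.5, and an age-gap shift of 0.8, the final acceptance probability would be $0.02 \times 1.5 \times 0.8 = 0.024$.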

#### Parameter Adjustments

Simulations using the baseline estimates show some discrepancies relative to marriage formation rates in the CPS micro data. Additional adjustments are applied to improve the fit between SIM results and CPS data. We benchmark against marriage rates (for female type $j$, given by $\frac{\Sigma_i\mu_{ij}}{S_j^f }$) for the most recent 9 years.

$$ \text{ratio}=\sum_{t=2005}^{2013}\frac{\text{marriage rate}(\text{SIM})_t}{\text{marriage rate}(\text{CPS})_t}/9 $$

Acceptance rates and divorce rates of females are adjusted based on this ratio. For example, if SIM overestimates the marriage rate, the updated acceptance rates are reduced and the updated divorce rates are increased. (This adjustment is also motivated by the steady-state condition of marriage stocks given by $\frac{f_{ij}}{\widetilde{\delta}_{ij}} =\frac{\mu_{ij}}{S_j^f }$. Differences between the real patterns and calibrated values are expected to be corrected by this procedure.)

\begin{align} \text{updated acceptance rate} &= \text{original acceptance rate}\ast \sqrt{1/\text{ratio}} \\ \text{updated divorce rate} &= \text{original divorce rate}\ast \sqrt{\text{ratio}} \end{align}

This process is applied along five dimensions: 1) based on the overall marriage rates of all females of age 18-84, 2) based on the marriage rates by 4 age groups, 3) based on the marriage rates by race and education of females, 4) based on the marriage rates by race and education of males, and finally 5) based on the same-race marriage rates. This process is then iterated five times to obtain the final parameters of the acceptance rates and the divorce rates previously presented. (The same-race adjustment compares the ratio between same-race marriages and interracial marriages in SIM and CPS. For example, if SIM over-estimates Asian-Asian marriage frequencies ($\text{SIM/CPS ratio} \gt 1$), acceptance rates of Asian-Asian marriages become smaller ($\sqrt{1/\text{ratio}}$) and acceptance rates of Other-Asian marriages become larger ($\sqrt{\text{ratio}}$) at the next iteration. Divorce rates of females of a particular race and education are adjusted based on the discrepancy of counterpart male types. Maximum and minimum shifts of 3 and 0.333 limit the magnitude of this adjustment.)
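A single adjustment step can be sketched as below. The function names are hypothetical, and the sketch assumes that the bounds of 3 and 0.333 apply to the multiplicative shift (the square-root term) rather than to the raw SIM/CPS ratio, which the text does not specify.

```python
import math

MIN_SHIFT, MAX_SHIFT = 0.333, 3.0   # bounds stated in the text

def clip_shift(shift):
    """Limit the multiplicative shift (assumed to be the quantity that is bounded)."""
    return max(MIN_SHIFT, min(MAX_SHIFT, shift))

def adjust_rates(acceptance, divorce, sim_rates, cps_rates):
    """One adjustment step from yearly SIM and CPS marriage rates (2005-2013)."""
    ratio = sum(s / c for s, c in zip(sim_rates, cps_rates)) / len(sim_rates)
    new_acceptance = acceptance * clip_shift(math.sqrt(1 / ratio))
    new_divorce = divorce * clip_shift(math.sqrt(ratio))
    return new_acceptance, new_divorce
```

If the simulation overstates the marriage rate by 20 percent in every year, the ratio is 1.2, so acceptance rates are scaled by $\sqrt{1/1.2}\approx 0.913$ and divorce rates by $\sqrt{1.2}\approx 1.095$. Repeating the step over each benchmark dimension and iterating would mirror the procedure described above.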

In summary, the estimation procedure assumes that the marriage acceptance patterns observed from 2005 to 2013 for different race, education, and age subgroups will be repeated by the same subgroups in the future, given new distributions of race, education level, and age among the male and female populations.