
Economic Heterogeneity Indicators:
Frequently Asked Questions

Why is the New York Fed conducting research on heterogeneity?
Developing the EHIs is consistent with the mission of the New York Fed to make the U.S. economy stronger and the financial system more stable for all segments of society. Understanding economic trends and impacts in different parts of the economy is important for designing effective monetary policy.

How do you define the region?
Due to data availability, we define the region as all counties (or zip codes, for our consumer spending analysis) belonging to the Federal Reserve Second District excluding Puerto Rico, the U.S. Virgin Islands, Warren County, NJ, and Fairfield County, CT, and additionally including Ocean County, NJ for our measures of inflation, employment, earnings, and consumer spending. For our small business analysis, we define the region as New York, New Jersey, and Connecticut.

How often are the EHIs updated?
The full set of EHIs is updated every three months, following the schedule posted on the EHI webpage.

Can we obtain the underlying data?
The EHI data are available for download.

What factors are used in calculating inflation for different demographic groups?
There are no official demographic-specific measures of inflation. However, the Bureau of Labor Statistics (BLS) computes separate inflation indexes (consumer price indexes, or CPIs) by metro area for different categories of goods and services, such as food, clothing, energy, housing, or entertainment. The BLS also conducts a Consumer Expenditure Survey (CEX), which shows how different demographic groups allocate their spending across these categories. For example, as the 2019 CEX shows, Black Americans spend relatively more on transportation and housing, and relatively less on food and entertainment, than white Americans do. For each month, we use the CEX from the year prior, or the most recent CEX, whichever is available.

Using a procedure similar to several papers in the literature (Hobijn and Lagakos 2005, McGranahan and Paulson 2005, and Jaravel 2019), the EHIs assume that prices within each goods or service category are the same for everyone within a metro area and are well represented by the CPIs, but that different groups consume different amounts of goods and services from different categories. Data from the CEX on each demographic group’s budget shares for more than thirty categories of goods and services is combined with CPI data on inflation rates for these categories.

Estimates of the inflation of the consumption basket for each demographic group in that metro area are then obtained as a weighted average of the CPIs of the components of the consumption basket, with the weights being that group’s expenditure shares of the components. All the contributions are then weighted by the group’s population in each city to get an inflation index for the demographic group. This inflation measure is referred to as demo-CPI, which is the basis for statements about changing inflation gaps across demographics over time.
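The two-step weighting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the metro labels, and all numbers are made up, and the sketch averages inflation rates directly, which may differ from how the EHIs combine the underlying index levels.

```python
def basket_inflation(budget_shares, category_inflation):
    """Weighted average of category inflation rates, using one group's budget shares."""
    total_share = sum(budget_shares.values())  # normalize in case shares do not sum to 1
    return sum(
        (share / total_share) * category_inflation[category]
        for category, share in budget_shares.items()
    )


def demo_cpi(shares_by_metro, inflation_by_metro, population_by_metro):
    """Population-weighted average of the group's metro-level basket inflation."""
    total_population = sum(population_by_metro[m] for m in shares_by_metro)
    return sum(
        (population_by_metro[m] / total_population)
        * basket_inflation(shares_by_metro[m], inflation_by_metro[m])
        for m in shares_by_metro
    )


# Hypothetical inputs for one demographic group (not real CEX or CPI figures).
shares = {
    "metro_a": {"food": 0.5, "housing": 0.5},
    "metro_b": {"food": 0.25, "housing": 0.75},
}
inflation = {  # category inflation rates in percent, by metro
    "metro_a": {"food": 2.0, "housing": 4.0},
    "metro_b": {"food": 4.0, "housing": 8.0},
}
population = {"metro_a": 300, "metro_b": 100}  # the group's population in each metro

group_inflation = demo_cpi(shares, inflation, population)
```

With these made-up inputs, the group's basket inflation is 3 percent in the first metro and 7 percent in the second, and the population weights (three quarters and one quarter) give 4 percent overall.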

Is it fair to assume that prices are the same for everyone?
Prices are assumed to be the same for everyone within a metro area, but to vary across metro areas. It is likely that this approach underestimates inflation heterogeneities between different groups of Americans. That is because, in addition to consuming different bundles of goods, different demographics likely face different prices for the same goods, with lower-income Americans and Black Americans often facing higher price growth.

How do you measure heterogeneities in inflation across demographic categories (race, ethnicity, income, education, age)?
Data from the CEX on each demographic group’s budget shares for more than thirty categories of goods and services is combined with CPI data on inflation rates for these categories.

In an innovation to the literature, the CEX and CPI data are allowed to vary across twenty-three major U.S. metro areas, comprising nearly 40 percent of the U.S. population and ranging in size from St. Louis to New York City. CEX respondents not residing in one of these major metro areas are matched to the CPI of smaller cities and towns in their respective U.S. census region (Northeast, Midwest, South, or West).
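The matching rule in the paragraph above amounts to a lookup with a regional fallback. A minimal sketch follows; the function name, the tuple labels, and the short metro list are hypothetical and stand in for the twenty-three metro areas, which are not enumerated here.

```python
def assign_cpi_area(metro_area, census_region, major_metros):
    """Pick the CPI series to apply to one CEX respondent.

    Respondents in a major metro area get that metro's CPI; everyone else
    gets the CPI for smaller cities and towns in their census region.
    """
    if metro_area in major_metros:
        return ("metro", metro_area)
    return ("region_small_cities", census_region)


# Hypothetical subset of the major metro areas, for illustration only.
major_metros = {"New York City", "St. Louis"}

in_metro = assign_cpi_area("St. Louis", "Midwest", major_metros)
outside_metro = assign_cpi_area(None, "South", major_metros)
```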

How do you measure heterogeneities in inflation across geographies (census region, urban status)?
Unlike for the other heterogeneities explored in the economic heterogeneity indicators, the BLS provides estimates of the CPI by U.S. census region (Northeast, Midwest, South, and West). Therefore, they are used here to compute inflation differences between the census regions and the national average.

The methodology for assessing demographic categories (combining budget shares from the CEX with CPI inflation measures at the metro area and census region level) is also used to compute the rural-urban inflation differential. However, an additional complication of the latter analysis is that while the CEX surveys both urban and rural households, the CPI only collects urban prices for each census region. Therefore, following Hobijn and Lagakos (2005), when measuring the inflationary experience of rural households, rural budget shares are used but not rural prices. The results on rural households thus carry a caveat, but they are worth reporting given the large and intuitive inflation disparity that is uncovered.

How does your methodology differ from the literature covering inflation differences observed in previous periods?
The approach used for the EHIs is an improvement on the procedure applied in previous literature, which uses national prices that are fixed across geography and demographics at a specific point in time. In contrast, here prices are allowed to vary across metro areas. This change has the advantage of assigning prices to households in a more accurate manner. An implicit assumption of this approach is that prices of goods are the same across demographic and income groups within a major metro area, or for people outside a major metro area in the same census region, so that variation in inflation occurs only through differing consumption baskets and different locations. This is a much weaker assumption than the fixed price assumption used in the literature. The use of twice as many goods categories but no variation in inflation by metro area (that is, the use of national prices) was also explored, and produced broadly similar results.

How do you measure earnings and employment gaps in labor market outcomes between different demographic groups?
To understand earnings gaps, we compute ratios of the average earnings of one group to those of another group, expressed in percent. These ratios document earnings heterogeneity among workers who are employed and do not capture differential employment rates. The earnings gap is 100 minus this ratio.

In particular, Black, Hispanic and AAPI earnings ratios are defined as the ratios of the earnings of workers in each specific racial and ethnic group to the earnings of white workers. The non-college earnings ratio is defined as the ratio of the earnings of workers without a bachelor’s degree to the earnings of workers with at least a bachelor’s degree. The women’s earnings ratio is defined as the ratio of women’s earnings to men’s earnings.

The race by gender earnings ratios are defined as the ratio of the earnings of the race by gender group in question to the earnings of white men. The rural earnings ratio is the ratio of earnings of rural workers to earnings of urban workers. Finally, the veteran earnings ratio is the ratio of earnings of veterans to the earnings of comparable nonveterans (comparable nonveterans are defined further on in the FAQ). Thus, for example, if the non-college earnings ratio is 55 percent, this means that workers without a bachelor’s degree earn 55 percent of what workers with a bachelor’s degree earn.

For employment, unemployment or labor force participation rate gaps, we take percentage point differences between the rates of one group and another group. Thus, the gender gap in any of these outcomes is defined as the outcome for men minus the outcome for women. The racial gaps are defined as the outcome for white workers minus the outcome for the given race or ethnicity. The college gap is the outcome for workers with a bachelor’s degree minus the outcome for workers without one. The rural gap is the outcome for urban workers minus the outcome for rural workers. The veterans’ gap is the outcome for veterans minus the outcome for comparable nonveterans. For example, if the gender employment gap is 11 percentage points, this means that men are 11 percentage points more likely to be employed than women are.
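The ratio and gap definitions above reduce to simple arithmetic. The following sketch uses hypothetical function names and made-up inputs chosen to reproduce the two examples in the text (a 55 percent non-college earnings ratio and an 11 percentage point gender employment gap).

```python
def earnings_ratio(group_earnings, reference_earnings):
    """Earnings of a group as a percentage of the reference group's earnings."""
    return 100.0 * group_earnings / reference_earnings


def earnings_gap(group_earnings, reference_earnings):
    """Earnings gap in percent: 100 minus the earnings ratio."""
    return 100.0 - earnings_ratio(group_earnings, reference_earnings)


def rate_gap(reference_rate, group_rate):
    """Percentage point gap: reference group's rate minus the other group's rate."""
    return reference_rate - group_rate


# Made-up average weekly earnings: non-college 550, college 1000.
non_college_ratio = earnings_ratio(550, 1000)   # 55.0 percent
non_college_gap = earnings_gap(550, 1000)       # 45.0 percent

# Made-up employment rates in percent: men 86, women 75.
gender_employment_gap = rate_gap(86.0, 75.0)    # 11.0 percentage points
```

The reference group is always the second group named in each definition: white workers for the racial gaps, men for the gender gap, workers with a bachelor's degree for the college gap, urban workers for the rural gap, and comparable nonveterans for the veterans' measures.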

How do you measure the real and nominal earnings of different demographic groups (disability, veteran status, urban status, age, education, race, ethnicity, and business size)?
Monthly non-seasonally adjusted data on average weekly earnings are used for Asian, Black, Hispanic, and white workers aged sixteen and older from the Current Population Survey, a joint effort by the U.S. Census Bureau and the Bureau of Labor Statistics. Weekly earnings can vary because of changes in hourly wages or because of changes in hours worked per week. Similar results for racial and ethnic heterogeneity are obtained using hourly wages, so the findings here apply directly to the price of labor rather than to changes in hours worked.

Since the characteristics of the employed population change with the economy, changes to weekly earnings may reflect both changes in the composition of the employed pool as well as changes in the prices of particular skills. Nominal earnings are deflated by the EHIs’ demographic-specific inflation measures, although the results are similar if earnings are deflated using the CPI.

How do you define the population of veterans?
The 2019 five-year American Community Survey (ACS), the last one before the onset of the COVID-19 pandemic, is used to compute average outcomes for male veterans and nonveterans aged between 25 and 69. This cut of the data looks at the population of veterans who served when enlistment in the armed forces was voluntary, after the end of the draft in 1971.

It is a challenge to construct a comparison group since veterans differ from nonveterans along many dimensions. For example, veterans are overwhelmingly likely to be male high school graduates, as the military typically requires a high school degree for service. Veterans are older, with enlistment rates drifting down over time. They are also more likely to be native-born and white, and more likely to have been born in the South and the Midwest than in the Northeast and the West.

Therefore, to build a more comparable comparison group for veterans, the population of nonveteran male high school graduates is weighted to match the age, racial, ethnic, immigrant and geographic distributions of veterans. The fractions of the male high school graduate population in each age, race, origin, and geography category who are veterans are used as weights. This control group is referred to as “comparable nonveterans.”
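A minimal sketch of this reweighting follows. It takes the description above literally: each nonveteran is weighted by the fraction of veterans in his cell, where a cell stands for one combination of age, race, origin, and geography. The record layout, function names, cell labels, and numbers are all hypothetical, and the exact weighting formula used for the EHIs may differ.

```python
from collections import defaultdict


def veteran_share_weights(records):
    """Fraction of veterans in each cell. Records are (cell, is_veteran, earnings)."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [veterans, everyone]
    for cell, is_veteran, _ in records:
        counts[cell][1] += 1
        if is_veteran:
            counts[cell][0] += 1
    return {cell: vets / total for cell, (vets, total) in counts.items()}


def comparable_nonveteran_mean(records):
    """Weighted mean earnings of nonveterans, using cell veteran shares as weights."""
    weights = veteran_share_weights(records)
    weighted_sum = 0.0
    weight_total = 0.0
    for cell, is_veteran, earnings in records:
        if not is_veteran:
            weighted_sum += weights[cell] * earnings
            weight_total += weights[cell]
    return weighted_sum / weight_total


# Made-up sample of male high school graduates in two cells.
sample = (
    [("older", True, 900.0)] * 2 + [("older", False, 1000.0)] * 2
    + [("younger", True, 700.0)] * 1 + [("younger", False, 600.0)] * 3
)
reweighted_mean = comparable_nonveteran_mean(sample)
```

In this toy sample, veterans are concentrated in the older cell, so older nonveterans receive a weight of 0.5 and younger nonveterans a weight of 0.25. The reweighted nonveteran mean is therefore pulled toward the older cell relative to the unweighted mean of 760.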

Although this methodology does not remove all sources of differences between veterans and “reweighted” nonveterans (for example, the veterans may differ from nonveterans in other aspects of their background, or in unobservable characteristics such as personality or interests, for which there is no data in the ACS), it avoids the most obvious sources of noncomparability between them and allows us to focus on the consequences of being a veteran.

How do you define business size?
Throughout the EHIs, “businesses” refer to “firms,” as defined in the Quarterly Workforce Indicators (QWI). Firm size is based on the “firm’s national employment on March 12th of the previous year (current year for new firms).”

Why do you report on the Employment to Population Ratio (EPOP)?
The EHIs report on the EPOP for prime-aged workers, as reported by BLS, as a measure of the state of the labor market for a given group. This is the ratio of the number of people aged 25 to 54 in each group who are employed (including self-employed) to the total number of people in that age bracket in that group. An alternative measure could be the unemployment rate; however, it captures only people who are not employed but are looking for work and misses people who currently have given up looking for work but may return to work when economic conditions improve. The Federal Reserve System defines its labor market half of the dual mandate as “maximum employment,” consistent with making EPOP a key labor market indicator.

An extensive literature suggests that people who have stopped looking for work are an important part of labor market dynamics. In contrast to the unemployment rate, EPOP accounts for people dropping out of and then returning to the labor force for economic reasons. A possible problem with EPOP computed for the entire adult population (as well as with other labor market measures) arises if, over time, people spend more time getting an education or retire early, or if the age composition of the population changes. Therefore, the EHIs consider EPOP only for prime-aged workers, those between 25 and 54, who typically have completed their education and would work under ordinary circumstances. Indeed, this age group tends to be strongly connected to employment; on the eve of the arrival of COVID-19 in February 2020, this group had an EPOP ratio of more than 80 percent.
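As a small illustration of the prime-age restriction, the sketch below computes EPOP from person-level records. The record layout and the data are hypothetical, and survey weights, which any real calculation from the CPS would use, are left out.

```python
def prime_age_epop(people, min_age=25, max_age=54):
    """Employment-to-population ratio, in percent, for people aged 25 to 54."""
    prime = [p for p in people if min_age <= p["age"] <= max_age]
    employed = sum(1 for p in prime if p["employed"])
    return 100.0 * employed / len(prime)


# Made-up records; the 20-year-old and the 60-year-old fall outside the age bracket.
people = [
    {"age": 25, "employed": True},
    {"age": 30, "employed": True},
    {"age": 40, "employed": True},
    {"age": 50, "employed": False},
    {"age": 54, "employed": True},
    {"age": 20, "employed": False},
    {"age": 60, "employed": True},
]
epop = prime_age_epop(people)  # 4 of the 5 prime-age people are employed
```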

How do you define the unemployment rate?
The BLS’s definition of the unemployment rate is used. It counts as unemployed anyone who 1) does not have a job, and 2) reports that they are looking for a job. This corresponds to the U3 definition of unemployment used by the BLS. In particular, those who are employed part-time for economic reasons (that is, they would be willing to work full-time but cannot find a full-time job) are not counted as unemployed.

Why do you report the labor force participation rate?
The labor force participation (LFP) rate is reported because, along with the unemployment rate, it is a critical contributor to the employment rate. During business cycles, people routinely leave the labor market and return to it subsequently, affecting the LFP rate without affecting the unemployment rate. During the COVID-19 pandemic, many interesting hypotheses (for example, that the near-elderly might retire in greater numbers, or that women may stay at home to a greater extent than before) had implications specifically for labor force participation, and differences in labor force participation are particularly important in explaining differences in employment by race and by gender.
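The link between the three rates is an accounting identity: the employment rate equals the LFP rate times one minus the unemployment rate. The sketch below checks this on made-up head counts; the function and variable names are hypothetical.

```python
def labor_market_rates(employed, unemployed, not_in_labor_force):
    """Return the unemployment, LFP, and employment rates as fractions."""
    labor_force = employed + unemployed
    population = labor_force + not_in_labor_force
    return {
        "unemployment_rate": unemployed / labor_force,  # share of the labor force
        "lfp_rate": labor_force / population,           # share of the population
        "employment_rate": employed / population,       # share of the population
    }


# Made-up counts: 57 employed, 3 unemployed, 40 out of the labor force.
rates = labor_market_rates(employed=57, unemployed=3, not_in_labor_force=40)

# Identity: employment rate = LFP rate * (1 - unemployment rate).
implied_employment_rate = rates["lfp_rate"] * (1.0 - rates["unemployment_rate"])
```

Because of this identity, two groups with the same unemployment rate can still have different employment rates if their participation rates differ.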

How do you define the population of those with disabilities?
The EHIs use the “any difficulty” indicator in the Current Population Survey (CPS) to identify the population of those with disabilities. Specifically, “any difficulty” indicates an affirmative response to having one or more of the six physical or cognitive difficulties measured by the CPS. These six difficulties are difficulty hearing, seeing, remembering, walking or climbing stairs, taking care of personal needs, or performing basic tasks outside the home alone.

What is the underlying data for the wealth EHIs?
The Distributional Financial Accounts (DFA), which provide quarterly estimates of the distribution of U.S. household wealth, incorporate data from both the Survey of Consumer Finances (SCF) and the Financial Accounts of the United States. The DFA, the SCF, and the Financial Accounts are all published by the Board of Governors of the Federal Reserve.

How do you define per household wealth?
Wealth is defined as net worth, or assets less liabilities. The total wealth of a group is divided by the number of households in that group according to the SCF to yield wealth per household.
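In code, the definition is a single division. The function name and figures below are hypothetical and only illustrate the arithmetic.

```python
def wealth_per_household(total_assets, total_liabilities, households):
    """Net worth (assets less liabilities) of a group divided by its household count."""
    net_worth = total_assets - total_liabilities
    return net_worth / households


# Made-up group totals: assets of 1,200, liabilities of 200, and 4 households.
per_household = wealth_per_household(1200.0, 200.0, 4)  # 250.0
```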

How are the wealth numbers interpolated quarterly if the SCF numbers are triennial?
The DFA takes information on demographic distributions from the SCF, but the SCF is published only every three years. Therefore, the DFA uses sophisticated imputation methods to interpolate distributions of household financial information into the quarters between SCF releases. These interpolated distributions are used in the DFA quarters for which no SCF is available.

How do you measure consumer spending by demographic and geographic groups (income, education, age, race, and urban status)?
Detailed receipt-level transaction data provided by Numerator, a market research firm, is used to capture consumer retail spending. Numerator captures spending by elective receipt uploads and permissioned email-scraping of a balanced panel of 200,000 households. Spending is observed in the same categories as in the Census Advance Monthly and Monthly Retail Trade Surveys (MARTS) releases and aligns well with national retail sales numbers found in MARTS. Numerator’s data collection approach creates a panel that is aligned demographically and geographically to the U.S. Census. Person-level Numerator data is used to identify the differences in consumer spending across households of different incomes, education levels, ages, races, and urban status.

How is retail defined in your consumer spending measures?
For the national consumer spending report, we define retail as retail sales excluding automobiles (ex auto), as is typical in this literature, due to the high volatility of auto sales. For the region, we further restrict retail to retail sales excluding automobiles and nonstore (ex auto ex nonstore) purchases, in accordance with how Monthly Retail Sales from the Advance Monthly Retail Trade Survey (MARTS) data tracks retail ex auto ex nonstore at the state and local level.

How do you calculate the retail spending measures?
Numerator aggregates receipt-level transaction data by month, spending category, and demographic. They then report the monthly average spending by spending category and demographic grouping. We then deflate these measures using our good-specific inflation indices for each demographic and adjust the measures for seasonality. These values are used to calculate year-over-year and cumulative growth rates, which are reported in the EHIs.
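The deflation and growth-rate steps can be sketched as follows. This is an illustration only: the function names and series are made up, the deflator is assumed to be an index with a base of 100, and the seasonal adjustment step is omitted because its method is not described here.

```python
def real_spending(nominal, deflator):
    """Deflate nominal spending with a price index whose base value is 100."""
    return [n / d * 100.0 for n, d in zip(nominal, deflator)]


def year_over_year_growth(series, periods_per_year=12):
    """Percent change relative to the same month one year earlier."""
    return [
        100.0 * (series[t] / series[t - periods_per_year] - 1.0)
        for t in range(periods_per_year, len(series))
    ]


def cumulative_growth(series, base_index=0):
    """Percent change of every observation relative to a chosen base period."""
    base = series[base_index]
    return [100.0 * (value / base - 1.0) for value in series]


# Made-up monthly real spending: flat at 100 for a year, then 110 in month 13.
monthly = [100.0] * 12 + [110.0]
yoy = year_over_year_growth(monthly)                  # one value, about 10 percent
cumulative = cumulative_growth([100.0, 105.0, 120.0])  # about 0, 5, and 20 percent
```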

What are the different demographic and geographic groups for which you can observe the spending data?
Numerator provides aggregate spending at a variety of levels of demographic and geographic granularity. We observe households by different levels of income, education, age, race/ethnicity, urban status, and census region and division.

How do you calculate demographic-good deflators?
Demographic-good deflators are computed by taking a weighted average of the city-level prices of the goods comprising the goods category (for instance, all goods comprising the retail category), with the weights being the demographic-specific budget shares of the goods in that city, and averaging deflators across cities using the populations of the demographic group in question as weights.
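One way to write this down, as a sketch rather than the authors' exact formula, is shown below. The notation is introduced here for illustration: $g$ is a demographic group, $k$ a goods category, $c$ a city, $j$ a good within the category, $P_{c,j}$ the city-level price index for good $j$, $s_{g,c,j}$ the group's budget share of good $j$ in city $c$, and $N_{g,c}$ the group's population in city $c$. Renormalizing the budget shares so that they sum to one within the category is an assumption.

```latex
\[
D_{g,c,k} \;=\; \sum_{j \in k} \frac{s_{g,c,j}}{\sum_{j' \in k} s_{g,c,j'}} \, P_{c,j},
\qquad
D_{g,k} \;=\; \sum_{c} \frac{N_{g,c}}{\sum_{c'} N_{g,c'}} \, D_{g,c,k}.
\]
```

The first expression is the city-level deflator for the group and category, and the second averages the city-level deflators using the group's population in each city as weights.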

How are “small businesses” defined?
The Federal Reserve’s Small Business Credit Survey (SBCS) covers a sample of firms with fewer than 500 employees. In survey years 2019-24, the median business has six employees.

Is the Small Business Credit Survey (SBCS) sample representative of U.S. small firms?
The SBCS is not a random sample. Instead, the survey relies on an outreach-based recruitment process: partner organizations (such as chambers of commerce, small business support groups, etc.) distribute the online survey to businesses in their networks, and the Federal Reserve also invites firms through select email lists as well as prior SBCS participants. Because who receives and responds to those invitations is shaped by network coverage and willingness to participate, the SBCS is best described as an “influence sample” rather than a randomly drawn sample.

How do you adjust the sample to make it more comparable to the national small business population?
To make results more comparable to the national small business population, the SBCS team applies survey weights to the variables so that, after weighting, the distribution of responding firms matches U.S. Census Bureau benchmarks on key dimensions. Firms that are underrepresented in the raw responses receive higher weights, and overrepresented firms receive lower weights. The weighting aligns to characteristics such as industry, geography (including urban/rural), firm age, and owner demographics (race/ethnicity and gender); for employer firms, weights also incorporate firm size (number of employees), and employer and nonemployer firms are handled separately. The weights improve representativeness but do not substitute for a true random sample.
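The basic idea of such weights can be illustrated with a one-dimensional post-stratification sketch: each firm's weight is its cell's benchmark share divided by that cell's share of the raw sample. This is a simplification under stated assumptions. The actual SBCS weighting aligns several dimensions at once and its procedure is not reproduced here, and the function names, cells, and benchmark shares below are made up.

```python
from collections import Counter


def poststratification_weights(sample_cells, benchmark_shares):
    """Weight per cell: benchmark population share divided by raw sample share."""
    counts = Counter(sample_cells)
    n = len(sample_cells)
    return {cell: benchmark_shares[cell] / (counts[cell] / n) for cell in counts}


def weighted_share(flags, cells, weights):
    """Weighted share of firms for which the flag is true."""
    total = sum(weights[c] for c in cells)
    flagged = sum(weights[c] for flag, c in zip(flags, cells) if flag)
    return flagged / total


# Made-up sample: 8 urban and 2 rural firms, against a 60/40 benchmark split.
cells = ["urban"] * 8 + ["rural"] * 2
weights = poststratification_weights(cells, {"urban": 0.6, "rural": 0.4})

# Suppose only the two rural firms report higher revenues.
reported_higher = [False] * 8 + [True] * 2
share_higher = weighted_share(reported_higher, cells, weights)
```

Rural firms are underrepresented in this toy sample, so they receive a weight above one (2.0, against 0.75 for urban firms). The weighted share reporting higher revenues is 40 percent, compared with 20 percent in the raw responses.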

Why does the sample exclude “nonemployer” firms?
Although the SBCS questionnaire covers both employers and nonemployers (that is, a firm with no employees other than its owner), we restrict this analysis to firms with at least one paid employee other than their owner(s). One reason is that we are focused on employment growth. More generally, nonemployer firms have business models that are distinct from those of employer firms. For more about this issue, see this report.

Why is profitability information only available for the year before the survey?
The questionnaire asks businesses to report their profitability “at the end of last year.” For example, respondents in the 2025 survey report their profit levels as of the end of 2024. For consistency with the other questions (which refer to the survey year), we assign each profitability response to the year it refers to, rather than the year in which the firm was surveyed.

How should revenue and employment changes be interpreted?
The SBCS asks respondents whether revenues and employment were higher, lower, or flat relative to the previous year. These changes may not correspond to sample-wide changes in revenues or employment levels. For example, even if the proportion of businesses reporting higher revenues increases, total revenues may have fallen in the same time period if the amount of revenue lost by firms reporting lower revenues exceeds the amount gained by firms reporting higher revenues.
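A toy example makes the distinction concrete. The numbers and function names are made up: three of four firms gain a little revenue while one firm loses a lot.

```python
def share_reporting_higher(revenue_changes):
    """Fraction of firms whose revenue rose relative to the previous year."""
    return sum(1 for change in revenue_changes if change > 0) / len(revenue_changes)


def total_revenue_change(revenue_changes):
    """Sum of revenue changes across all firms in the sample."""
    return sum(revenue_changes)


# Made-up year-over-year revenue changes for four firms.
changes = [10, 10, 10, -50]
share_higher = share_reporting_higher(changes)  # 0.75
net_change = total_revenue_change(changes)      # -20
```

Three quarters of the firms report higher revenues, yet total revenue across the four firms falls by 20.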

What is a “diffusion index”?
A diffusion index is a summary statistic that captures the direction of change in a survey with multiple-choice responses. It is calculated by subtracting the percentage of firms reporting a decrease in a variable from the percentage of firms reporting an increase in the same variable. Firms responding “no change” have no impact on the diffusion index.
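The calculation can be sketched as below. The response labels follow the SBCS wording quoted earlier (higher, lower, flat); the function name and the sample responses are hypothetical.

```python
def diffusion_index(responses):
    """Percent of firms reporting an increase minus percent reporting a decrease."""
    n = len(responses)
    pct_higher = 100.0 * sum(1 for r in responses if r == "higher") / n
    pct_lower = 100.0 * sum(1 for r in responses if r == "lower") / n
    return pct_higher - pct_lower


# Made-up responses from ten firms: 5 higher, 2 lower, 3 flat.
responses = ["higher"] * 5 + ["lower"] * 2 + ["flat"] * 3
index = diffusion_index(responses)  # 50 percent minus 20 percent
```

A positive value means more firms reported an increase than a decrease. Firms answering "flat" enter the calculation only through the total number of respondents.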

Are there additional small business outcomes in the survey?
Yes, we have only reported a subset of the outcome variables covered in the survey. For example, there are other questions on small bank finances that we have not covered. For comprehensive reports from the SBCS, visit https://www.fedsmallbusiness.org/reports/survey.

Economic analysis often focuses on understanding the average effects of a policy or program. However, it is vital to study how the economic trends and economic effects of policies vary across demographic, geographic and socioeconomic boundaries to understand their impacts on the macroeconomy. Analysis of the New York Fed EHIs helps bring a deeper understanding of economic growth considerations to policymaking, research, and practice.

The EHIs are updated at or shortly after 10 a.m. on the dates posted on the EHI webpage.

The EHIs are not official estimates of the Federal Reserve Bank of New York, its President, the Federal Reserve System, or the Federal Open Market Committee.
