Survey Methods and Reliability Statement for the May 2012 Occupational Employment Statistics Survey
- Occupational and industrial classification systems
- 3-year survey cycle of data collection
- Sampling procedures
- Response and nonresponse
- Estimation methodology
- Variance estimation
- Reliability of the estimates
- Data presentation
The Occupational Employment Statistics (OES) survey is primarily a mail survey measuring occupational employment and wage rates for wage and salary workers in nonfarm
establishments nationally, and in the 50 states and the District of Columbia, Guam, Puerto Rico, and the Virgin Islands.
About 6.7 million in-scope establishments are stratified within their respective states by substate area, industry, and ownership. Substate areas include all officially
defined metropolitan areas and one or more nonmetropolitan areas. The North American Industry Classification System (NAICS) is used to stratify establishments by industry.
Probability sample panels of about 200,000 establishments are selected semiannually. Most responses are obtained through mail with the remaining responses collected by
telephone, e-mail, internet, or other electronic means, or personal visit. Respondents report their number of employees by occupation across 12 wage ranges. The Standard
Occupational Classification (SOC) system is used to define occupations.
Estimates of occupational employment and occupational wage rates are based on six panels of survey data collected over a 3-year cycle. The final in-scope sample size
when six panels are combined is approximately 1.2 million establishments. Total 6-panel unweighted employment covers approximately 76 million of the total employment of 128 million.
Occupational and industrial classification systems
The occupational classification system
The U.S. Office of Management and Budget's Standard Occupational Classification (SOC) system is used to define occupations. All six panels were collected using the 2010 SOC
system. More information about the SOC system can be found at www.bls.gov/soc/.
The industrial classification system
The 2012 OES survey estimates use the 2012 North American Industry Classification System (NAICS). More information about NAICS can be found at the BLS web site
www.bls.gov/bls/naics.htm or in the 2012 North American Industry Classification System manual. Each establishment in the survey is assigned a
6-digit NAICS code based on its primary economic activity.
Industrial scope and stratification
The survey covers the following NAICS industry sectors:
- 11—Logging (1133), support activities for crop production (1151), and support activities for animal production (1152) only
- 42—Wholesale trade
- 44-45—Retail trade
- 48-49—Transportation and warehousing
- 52—Finance and insurance
- 53—Real estate and rental and leasing
- 54—Professional, scientific, and technical services
- 55—Management of companies and enterprises
- 56—Administrative and support and waste management and remediation services
- 61—Educational services
- 62—Health care and social assistance
- 71—Arts, entertainment, and recreation
- 72—Accommodation and food services
- 81—Other services, except public administration [private households (814) are excluded]
- Federal government executive branch (assigned industry code 999100)*
- State government (assigned industry code 999200)*
- Local government (assigned industry code 999300)*
* These are OES-defined industry codes and not part of the NAICS industry classification.
These sectors are stratified into about 345 industry groups at the 4-, 5-, or 6-digit NAICS level of detail.
An establishment is generally a single physical location at which economic activity occurs (e.g., store, factory, restaurant, etc.). Each
establishment is assigned a 6-digit NAICS code. When a single physical location encompasses two or more distinct economic activities, it is treated as two or more separate
establishments if separate payroll records are available and certain other criteria are met.
Employment refers to the number of workers who can be classified as full- or part-time employees, including workers on paid vacations or other
types of paid leave; salaried officers, executives, and staff members of incorporated firms; employees temporarily assigned to other units; and noncontract employees for whom
the reporting unit is their permanent duty station regardless of whether that unit prepares their paychecks.
The OES survey includes all full- and part-time wage and salary workers in nonfarm industries. Self-employed workers, owners and partners in unincorporated firms, household
workers, and unpaid family workers are excluded.
Occupations are classified based on work performed and on required skills. Employees are assigned to an occupation based on the work they perform
and not on their education or training. For example, an employee trained as an engineer but working as a drafter is reported as a drafter. Employees who perform the duties of
two or more occupations are reported in the occupation that requires the highest level of skill or in the occupation where the most time is spent if there is no measurable
difference in skill requirements. Working supervisors (those spending 20 percent or more of their time doing work similar to the workers they supervise)
are classified with the workers they supervise. Workers receiving on-the-job training, apprentices, and trainees are classified with the occupations
for which they are being trained.
A wage is money that is paid or received for work or services performed in a specified period. Base rate pay, cost-of-living allowances, guaranteed
pay, hazardous-duty pay, incentive pay such as commissions and production bonuses, and tips are included in a wage. Back pay, jury duty pay, overtime pay, severance pay, shift
differentials, nonproduction bonuses, employer costs for supplementary benefits, and tuition reimbursements are excluded. Federal government, the U.S. Postal Service (USPS), and
some states report individual wage rates for workers. Other employers are asked to classify each of their workers into one of the following 12 wage intervals:
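Table 1. OES survey wage intervals
| Interval | Hourly wages | Annual wages |
|---|---|---|
| Range A | Under $9.25 | Under $19,240 |
| Range B | $9.25 to $11.49 | $19,240 to $23,919 |
| Range C | $11.50 to $14.49 | $23,920 to $30,159 |
| Range D | $14.50 to $18.24 | $30,160 to $37,959 |
| Range E | $18.25 to $22.74 | $37,960 to $47,319 |
| Range F | $22.75 to $28.74 | $47,320 to $59,799 |
| Range G | $28.75 to $35.99 | $59,800 to $74,879 |
| Range H | $36.00 to $45.24 | $74,880 to $94,119 |
| Range I | $45.25 to $56.99 | $94,120 to $118,559 |
| Range J | $57.00 to $71.49 | $118,560 to $148,719 |
| Range K | $71.50 to $89.99 | $148,720 to $187,199 |
| Range L | $90.00 and over | $187,200 and over |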
3-year survey cycle of data collection
The survey is based on a probability sample drawn from a universe of about 6.7 million in-scope establishments stratified by geography, industry, size, and ownership. The
sample is designed to represent all nonfarm establishments in the United States.
The OES survey allocates and selects a sample of approximately 200,000 establishments semiannually. Semiannual samples are referred to as panels. To the extent possible,
private sector units selected in any one panel are not sampled again in the next five panels.
The survey is conducted over a rolling 6-panel (or 3-year) cycle. This is done in order to provide adequate geographic, industrial, and occupational coverage. Over the
course of a 6-panel (or 3-year) cycle, approximately 1.2 million establishments are sampled. In this cycle, data collected in May 2012 are combined with data collected in
November 2011, May 2011, November 2010, May 2010, and November 2009.
For a given panel, survey questionnaires are initially mailed out to almost all sampled establishments. State workforce agency staff may make personal visits to some of the
larger establishments; however, these are limited due to cost and time constraints. Three additional mailings are sent to nonrespondents at approximately 4-week intervals.
Telephone or e-mail follow-ups are made to nonrespondents.
Censuses of federal and state government are collected annually.
- A census of the executive branch of the federal government and the U.S. Postal Service (USPS) is collected annually in June from the U.S. Office of Personnel
Management (OPM), the Tennessee Valley Authority, and the U.S. Postal Service. Data from only the most recent year are retained for use in OES estimates.
- In each area, a census of state government establishments, except for schools and hospitals, is collected each November. Data from only the most
recent year are retained for use in OES estimates.
A probability sample is taken of local government establishments, private establishments, and state schools and hospitals.
The sampling frame, or universe, is a list of about 6.7 million in-scope nonfarm establishments that file unemployment insurance (UI) reports to the state workforce agencies.
Employers are required by law to file these reports to the state where each establishment is located. Every quarter, BLS creates a national sampling frame by combining the
administrative lists of unemployment insurance reports from all of the states into a single database called the Quarterly Census of Employment and Wages (QCEW). Every six months,
OES extracts the administrative data for establishments that are in scope for the OES survey from the most current QCEW. QCEW files were supplemented with frame files covering Guam
and rail transportation (NAICS 4821) because these establishments are not covered by the UI program.
Construction of the sampling frame includes a process where establishments that are linked together into multiunit companies are assigned to either the May or November sample.
This prevents BLS from contacting multiunit companies more than once per year. Furthermore, the frame is matched to the 5 prior sample panels, and units that have been previously
selected in the 5 prior panels are marked as ineligible for sampling for the current panel.
Establishments on the frame are stratified by geographic area and industry group.
- Geography—629 Metropolitan Statistical Areas (MSAs), metropolitan divisions, and nonmetropolitan or balance-of-state (BOS) areas are specified. MSAs and
metropolitan divisions are defined and mandated by the Office of Management and Budget. Each officially defined metropolitan area within a state is specified as a
substate area. Cross-state MSAs have a separate portion for each state contributing to that MSA. In addition, states may have up to six residual nonmetropolitan
areas that together cover the remaining non-MSA portion of their state.
- Industry—344 industry groups are defined at the NAICS 4-, 5-, or 6-digit level.
- Ownership—Schools are also stratified by state government, local government, or private ownership. Also, local government casino and gambling hotels are sampled
separately from the rest of local government.
- Size—Establishments are divided into certainty and noncertainty size classes.
At any given time, there are about 175,000 nonempty State/MSA-BOS/NAICS 4-, 5-, 6-digit/ownership strata on the frame. When comparing nonempty strata between frames, there may be
substantial frame-to-frame differences. The differences are due primarily to normal establishment birth and death processes and normal establishment growth and shrinkage. Other
differences are due to NAICS reclassification and changes in geographic location.
A small number of establishments indicate the state in which their employees are located, but do not indicate the specific county in which they are located. These establishments
are also sampled and used in the calculation of the statewide and national estimates. They are not included in the estimates of any substate area. Therefore, the sum of the employment
in the MSAs and nonmetropolitan areas within a state may be less than the statewide employment.
Allocation of the sample to strata
The frame is stratified into approximately 175,000 nonempty State/MSA-BOS/NAICS 4-, 5-, 6-digit/ownership strata. Each time a sample is selected, a 6-panel allocation of the 1.2
million sample units among these strata is performed. The largest establishments are removed from the allocation because they will be selected with certainty once during the 6-panel
cycle. For the remaining noncertainty strata, a set of minimum sample size requirements based on the number of establishments in each cell is used to ensure coverage for industries
and MSAs. For each State/MSA-BOS/NAICS 4-, 5-, 6-digit/ownership stratum, a sample allocation is calculated using a power Neyman allocation. The actual 6-panel sample allocation is
the larger of the minimum sample allocation and the power allocation. To determine the current single panel allocation, the 6-panel allocation is divided by 6 and the resulting
quotient is randomly rounded.
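The allocation rule for a single stratum can be sketched as follows. This is a minimal illustration, not the production procedure: it assumes that "randomly rounded" means rounding up with probability equal to the fractional part of the quotient, so that the expected single-panel allocation equals one-sixth of the 6-panel allocation. The function names and the example values are hypothetical.

```python
import math
import random

def six_panel_allocation(min_alloc, power_alloc):
    """6-panel allocation: the larger of the minimum and power allocations."""
    return max(min_alloc, power_alloc)

def single_panel_allocation(six_panel, rng):
    """Divide by 6 and randomly round.

    Assumption: round up with probability equal to the fractional part,
    so the expected value equals six_panel / 6.
    """
    quotient = six_panel / 6
    base = math.floor(quotient)
    round_up = rng.random() < (quotient - base)
    return base + (1 if round_up else 0)

rng = random.Random(2012)  # fixed seed so the sketch is repeatable
alloc = six_panel_allocation(min_alloc=8, power_alloc=10)
panel = single_panel_allocation(alloc, rng)  # either 1 or 2 for 10 / 6
```

Under this assumption, a stratum whose 6-panel allocation is an exact multiple of 6 always receives the same single-panel allocation.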
Two factors influence the power Neyman allocation. One is the square root of the employment size of each stratum. With a Neyman allocation, strata with higher levels of employment
generally are allocated more sample than strata with lower levels of employment. Using the square root within the Neyman allocation softens this effect. The other is a measure of the
occupational variability of the industry. The occupational variability of an industry is measured by computing the coefficient of variation (CV) for each occupation within the 90th
percentile of occupational employment in a given industry, averaging those CVs, and then calculating the standard error from that average CV. Using this measure, industries that tend
to have greater occupational variability will get more sample than industries that are more occupationally homogeneous.
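The effect of the two factors can be illustrated with a small sketch. The exact allocation formula is not given in this statement, so the form used here (sample share proportional to the square root of stratum employment times a variability measure) is an assumption made only for illustration, and all names and figures are hypothetical.

```python
import math

def power_neyman_shares(strata):
    """Return each stratum's share of the sample.

    strata maps a stratum name to (employment, variability).
    Assumed form: share is proportional to sqrt(employment) * variability.
    """
    scores = {name: math.sqrt(emp) * var for name, (emp, var) in strata.items()}
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}

strata = {
    "large, homogeneous": (10_000, 0.5),
    "small, homogeneous": (100, 0.5),
    "small, variable": (100, 1.0),
}
shares = power_neyman_shares(strata)
```

In this example the large stratum has 100 times the employment of the small homogeneous stratum but receives only 10 times the sample, and the more variable small stratum receives twice the sample of the homogeneous one.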
Sample selection within strata is approximately proportional to size. In order to provide the most occupational coverage, establishments with higher employment are more likely to
be selected than those with lower employment; some of the largest establishments are selected with certainty. The unweighted employment of sampled establishments makes up approximately
59.3 percent of total employment.
Permanent random numbers (PRNs) are used in the sample selection process. To minimize sample overlap between the OES survey and other large surveys conducted by the U.S. Bureau of
Labor Statistics, each establishment is assigned a PRN. For each stratum, a specific PRN value is designated as the "starting" point to select a sample. From this "starting" point, we
sequentially select the first 'n' eligible establishments in the frame into the sample, where 'n' denotes the number of establishments to be sampled.
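A minimal sketch of PRN-based selection within one stratum follows. It assumes that the frame is ordered by PRN and that selection wraps around to the smallest PRN if the end of the list is reached before 'n' units are found; the wrap-around rule, the tuple layout, and the example values are assumptions for illustration.

```python
def select_by_prn(frame, start, n):
    """Select the first n eligible units at or after `start` in PRN order.

    frame is a list of (establishment_id, prn, eligible) tuples.
    Assumption: selection wraps around to the smallest PRN if needed.
    """
    ordered = sorted(frame, key=lambda unit: unit[1])
    at_or_after = [u for u in ordered if u[1] >= start]
    before = [u for u in ordered if u[1] < start]
    eligible_ids = [u[0] for u in at_or_after + before if u[2]]
    return eligible_ids[:n]

frame = [
    ("e1", 0.12, True),
    ("e2", 0.47, True),
    ("e3", 0.55, False),  # e.g., sampled in one of the 5 prior panels
    ("e4", 0.81, True),
    ("e5", 0.93, True),
]
sample = select_by_prn(frame, start=0.50, n=3)
```

Because each establishment keeps the same PRN across surveys, choosing different starting points for different surveys tends to place them on different establishments.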
Single panel weights (sampling weights)
Sampling weights are computed so that each panel will roughly represent the entire universe of establishments.
Federal government, USPS, and state government units are assigned a panel weight of 1. Other sampled establishments are assigned a design-based panel weight, which reflects the
inverse of the probability of selection.
National sample counts
The combined sample for the May 2012 survey is the equivalent of six panels. The sample allocations excluding federal government and U.S. Postal Service (USPS) for the panels in this cycle are as follows:
- 202,144 establishments for May 2012
- 199,990 establishments for November 2011
- 201,275 establishments for May 2011
- 201,553 establishments for November 2010
- 200,282 establishments for May 2010
- 203,546 establishments for November 2009
The May 2012 sample includes 8,148 federal and USPS units. The combined sample size for the May 2012 estimates is approximately 1.2 million establishments, which includes only the
most recent data for federal and state government. Federal and state government units from older panels are deleted to avoid double counting.
Response and nonresponse
Of the approximately 1.2 million establishments in the combined initial sample, 1,114,359 were viable establishments (that is, establishments that are not outside the scope or out
of business). Of the viable establishments, 853,801 responded and 260,558 did not—a 76.6 percent response rate. The response rate in terms of weighted sample employment is 72.9 percent.
Nonresponse is a chronic problem in virtually all large-scale surveys because it may introduce a bias in estimates if the nonrespondents tend to differ from respondents in terms of
the characteristic being measured. To partially compensate for nonresponse, the missing data for each nonrespondent are imputed using plausible data from responding units with similar characteristics.
Establishments that do not report occupational employment data are called "unit" nonrespondents. Establishments that report employment data but fail to report some or all the
corresponding wages are called "partial" nonrespondents. Missing data for unit nonrespondents are imputed through a two-step imputation process. Missing data for partial nonrespondents are
imputed through the second step of the process only.
Step 1, Impute an occupational employment staffing pattern
For each unit nonrespondent, a staffing pattern is imputed using a nearest-neighbor "hot deck" imputation method. The procedure links a responding donor establishment to each nonrespondent.
The nearest-neighbor hot deck procedure searches within defined cells for a donor that uses the same occupational classification system and most closely resembles the nonrespondent by geographic
area, industry, and employment size. Ownership is also used in the hospital and education industries. The procedure initially searches for a donor whose reported employment is approximately the
same as the nonrespondent's frame employment within the same 5- or 6-digit NAICS, state, and ownership. If more than one otherwise equally qualified donor is found, a donor from a more recent
panel will be selected over a donor from an older panel. If the search is unsuccessful, the pool of donors is enlarged in incremental steps by expanding geographic area and industry until a
suitable donor is found. Limits are placed on the number of times a donor can be used.
After a donor has been found, its occupational staffing pattern is used to prorate the nonrespondent's frame employment by occupation. The prorated employment is the nonrespondent's
imputed occupational employment.
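Step 1 can be sketched in simplified form as follows. The sketch assumes the donor pool has already been restricted to a single imputation cell, picks the donor whose reported employment is closest to the nonrespondent's frame employment, and breaks ties in favor of the more recent panel. Cell expansion and donor-use limits are omitted, and the field names and figures are hypothetical.

```python
def pick_donor(donors, frame_employment):
    """Choose the donor closest in employment; prefer the more recent panel on ties."""
    return min(
        donors,
        key=lambda d: (abs(d["employment"] - frame_employment), -d["panel"]),
    )

def prorate_staffing(donor_staffing, frame_employment):
    """Scale the donor's staffing pattern to the nonrespondent's frame employment."""
    donor_total = sum(donor_staffing.values())
    return {
        occ: frame_employment * emp / donor_total
        for occ, emp in donor_staffing.items()
    }

donors = [
    {"employment": 20, "panel": 3, "staffing": {"cashiers": 15, "managers": 5}},
    {"employment": 20, "panel": 5,
     "staffing": {"cashiers": 12, "stock clerks": 6, "managers": 2}},
    {"employment": 45, "panel": 6, "staffing": {"cashiers": 40, "managers": 5}},
]
donor = pick_donor(donors, frame_employment=30)
imputed = prorate_staffing(donor["staffing"], frame_employment=30)
```

Here the two donors with 20 employees are equally close to the frame employment of 30, so the donor from the more recent panel is used, and its pattern is scaled up by a factor of 1.5.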
Step 2, Impute an employment distribution across wage intervals
For each "unit" nonrespondent in step 1 or for each "partial" nonrespondent, impute an employment distribution across wage intervals for occupations without complete wage data. This
distribution, called the wage employment distribution, is imputed as follows:
- Identify the imputation cell for each of the nonrespondent's occupations. Imputation cells are initially defined by MSA/BOS, NAICS 5/6, and size class from the most recent
panel only. For schools and hospitals, cells are further divided by ownership.
- Determine if the imputation cell has enough respondents to compute wage employment distributions. If not, incrementally enlarge the cell until there are enough respondents.
- Use the distributions above to prorate the nonrespondent's imputed occupational employment across wage intervals. (Or, for partial respondents, use the distributions above
to prorate the reported occupational employment across wage intervals.)
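The proration in the last step can be illustrated with a short sketch. It assumes that the wage employment distribution for an imputation cell is simply the respondents' employment in each wage interval for the occupation; the interval labels follow Table 1, and the figures are hypothetical.

```python
def prorate_across_intervals(occupation_employment, cell_distribution):
    """Spread occupational employment across wage intervals.

    cell_distribution maps wage interval -> respondents' employment in the
    imputation cell for this occupation (assumed form of the distribution).
    """
    total = sum(cell_distribution.values())
    return {
        interval: occupation_employment * count / total
        for interval, count in cell_distribution.items()
    }

cell_distribution = {"B": 10, "C": 30, "D": 10}
imputed_by_interval = prorate_across_intervals(20, cell_distribution)
```

The imputed employment keeps the cell's proportions (20 percent, 60 percent, 20 percent) and sums to the occupational employment being distributed.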
This section describes the weighting methodology and formulas used for making the estimates. Each semiannual sample represents roughly one-sixth of the establishments for the full 6-panel
sample plan and is used in conjunction with the previous five semiannual samples in order to create a combined sample of approximately 1.2 million establishments, which includes only the most
recent data for federal and state government.
Reweighting for the combined sample
Employment and wage rate estimates are computed using a rolling 6-panel (3-year) sample. Estimates for the May 2012 survey were calculated using data from the May 2012, November 2011,
May 2011, November 2010, May 2010, and November 2009 samples. Establishments from each panel's sample are initially assigned weights as if one panel were being used to represent the entire
population. When the samples are combined, each sampled establishment must be reweighted so that the aggregated sample across six panels represents the entire population. Establishments
selected with certainty in the 6-panel cycle are given a weight equal to 1. Noncertainty units are reweighted stratum-by-stratum. This revised weight is called the 6-panel combined sample
weight. The original single-panel sampling weights are computed so that responses in a stratum could be weighted to represent the entire stratum population. In one common scenario, six panel
samples are combined, and all six panels have sample units for a particular stratum. A summation of the single-panel weights would over-represent the population by a factor of six. Because we
do not want to over-represent the stratum population, the 6-panel combined sample weight of each establishment is set equal to 1/k times its single-panel sampling weight. In general, when six
panel samples are combined, a count of the number of panels with at least one unit selected for a given stratum is assigned to k.
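The reweighting can be checked with a small numerical sketch. The stratum size and sample counts below are hypothetical; k is the number of panels with at least one sampled unit in the stratum, as described above.

```python
def combined_weight(single_panel_weight, k, certainty=False):
    """6-panel combined weight: 1 for certainty units, otherwise weight / k."""
    if certainty:
        return 1.0
    return single_panel_weight / k

stratum_population = 600      # hypothetical number of establishments in the stratum
units_per_panel = 10          # hypothetical sample size in each panel
single_weight = stratum_population / units_per_panel  # each unit represents 60

k = 6  # all six panels have sample units in this stratum
weights = [combined_weight(single_weight, k) for _ in range(units_per_panel * k)]
represented = sum(weights)
```

Without the division by k, the 60 sampled units would represent 3,600 establishments, six times the stratum population; with it, they represent 600.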
Benchmarking to QCEW employment
A sum of ratio-adjusted weighted reported occupational employment is used to calculate estimates of occupational employment. The auxiliary variable for the estimator is the average of the
latest May and November employment totals from the Bureau's Quarterly Census of Employment and Wages (QCEW). For the May 2012 survey, the auxiliary variable is the average of May 2012 and
November 2011 employment. In order to balance the state need for estimates at differing levels of geography and industry, the ratio estimation process is carried out through a series of four
hierarchical employment ratio adjustments. The ratio adjustments are also known as benchmark factors (BMFs).
The first of the hierarchical benchmark factors is calculated for cells defined by state, MSA/BOS, NAICS 4/5/6, and employment size class (4 size classes: 1-19, 20-49, 50-249, 250+). For
establishments in the hospital and education industries (NAICS 622 and 611), the first hierarchical factor is calculated for cells defined by state, MSA/BOS, NAICS 4/5/6, employment size class
(4 size classes: 1-19, 20-49, 50-249, 250+), and ownership (state government, local government, or privately owned). If a first-level BMF is out of range, it is reset to a maximum (ceiling) or
minimum (floor) value. First-level BMFs are calculated as follows:
- h = MSA/BOS by NAICS 4/5/6
- H = state by NAICS 4/5/6
- s = employment size classes (1-19, 20-49, 50-249, 250+)
- S = aggregated employment size classes (1-49, 50+)
- o = ownership (state government, local government, privately owned)
- M = average of May and November QCEW employment
- wi = six-panel combined sample weight for establishment i
- xi = total establishment employment
- BMFmin = a parameter, the lowest value allowed for BMF
- BMFmax = a parameter, the highest value allowed for BMF
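Using the symbols defined above, the core of a first-level adjustment can be sketched as the ratio of benchmark employment M to weighted sample employment (the sum of wi times xi), limited to the range from BMFmin to BMFmax. This is a simplified illustration only: the fallback to aggregated size classes (S) and to the full cell is not modeled, and the parameter values and function name are hypothetical.

```python
def first_level_bmf(qcew_employment, sample_units, bmf_min, bmf_max):
    """Ratio of benchmark employment to weighted sample employment, with floor and ceiling.

    sample_units is a list of (w_i, x_i) pairs: combined weight and total
    establishment employment. The bounds are illustrative parameters.
    """
    weighted_employment = sum(w * x for w, x in sample_units)
    ratio = qcew_employment / weighted_employment
    return min(max(ratio, bmf_min), bmf_max)

units = [(10.0, 50), (10.0, 70)]  # weighted sample employment = 1,200
bmf_in_range = first_level_bmf(1500, units, bmf_min=0.5, bmf_max=2.0)
bmf_capped = first_level_bmf(3000, units, bmf_min=0.5, bmf_max=2.0)
```

In the second call the raw ratio of 2.5 exceeds the ceiling and is reset to 2.0; the employment left uncovered by such a reset is what the higher-level benchmark factors are meant to account for.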
Second-level BMFs are calculated for cells defined at the state, NAICS 4/5/6 level by summing the product of combined 6-panel weight and first-level BMF for each establishment in the cell.
For establishments in the hospital and education industries (NAICS 622 and 611), the second-level BMF is calculated at the state, NAICS 4/5/6, and ownership level.
Second-level BMFs account for the portion of universe employment that is not adequately covered by weighted employment in first-level benchmarking. Inadequate coverage occurs when
"MSA/BOS | NAICS 4/5/6 | size class" cells have no sample data or when a floor or ceiling is imposed on first-level BMFs. Second-level benchmarks are calculated as follows:
Third-level BMFs (BMF3,H) are calculated at the "State | 3-digit NAICS" cell level by summing the product of combined 6-panel weight, first-level BMF, and second-level BMF for each
establishment in the cell. The third-level BMF also benchmarks by ownership for the hospital and education industries. Fourth-level BMFs (BMF4,H) are calculated at the
"State | 2-digit NAICS" cell level by summing the product of final weight, first-level BMF, second-level BMF, and third-level BMF for each establishment in the cell. The fourth-level BMF
does not benchmark by ownership. As with second-level BMFs, third- and fourth-level BMFs are computed to account for inadequate coverage of the universe employment.
A final benchmark factor, BMFi, is calculated for each establishment as the product of its four hierarchical benchmark factors
(BMFi = BMF1 * BMF2 * BMF3 * BMF4). A benchmark weight value is then calculated as the product of the establishment's six-panel
combined sample weight and final benchmark factor.
Occupational employment estimates
Benchmark factors and the combined 6-panel weights are used to compute estimates of occupational employment. Estimates are produced for cells defined by geographic area, industry group, and
size of establishment (i.e., size class). The total employment for an occupation in a cell is estimated by taking the product of the reported occupational employment, the 6-panel combined sample
weight, and the final benchmark factor for each establishment in the cell, and summing the product across all establishments in the cell. This sum is the estimate of total occupational employment
in the cell.
The equation below is used to calculate occupational employment estimates for an estimation cell defined by geographic area, industry group, and size class.
$$\hat{X}_{ho} = \sum_{i \in h} w_i \, BMF_i \, x_{io}$$
where
- $o$ = occupation
- $h$ = estimation cell
- $w_i$ = six-panel combined sample weight for establishment $i$
- $BMF_i$ = final benchmark factor for establishment $i$
- $x_{io}$ = employment for occupation $o$ in establishment $i$
- $\hat{X}_{ho}$ = estimated employment for occupation $o$ in cell $h$
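The estimator described in this section can be sketched in a few lines: for each establishment in the cell, multiply the reported occupational employment by the combined weight and by the final benchmark factor (the product of the four hierarchical factors), then sum. The record layout and the figures are hypothetical.

```python
import math

def final_bmf(bmf_levels):
    """Final benchmark factor: product of the four hierarchical factors."""
    return math.prod(bmf_levels)

def occupational_employment(establishments, occupation):
    """Sum of weight * final BMF * reported employment over the cell."""
    return sum(
        e["weight"] * final_bmf(e["bmfs"]) * e["employment"].get(occupation, 0)
        for e in establishments
    )

cell = [
    {"weight": 4.0, "bmfs": (1.25, 1.0, 1.0, 1.0), "employment": {"drafters": 3}},
    {"weight": 2.0, "bmfs": (0.5, 1.0, 2.0, 1.0),
     "employment": {"drafters": 5, "engineers": 2}},
]
drafters = occupational_employment(cell, "drafters")
engineers = occupational_employment(cell, "engineers")
```

The first establishment contributes 4.0 x 1.25 x 3 = 15 drafters and the second contributes 2.0 x 1.0 x 5 = 10, for an estimate of 25.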
Wage rate estimation
Two externally derived parameters are used to calculate wage rate estimates. They are:
- the mean wage rates for each of the 12 wage intervals and
- wage updating factors (also known as aging factors)
Wage rates of workers are reported to the OES survey as grouped data across 12 consecutive, nonoverlapping wage bands. Individual wage rates are collected and used for federal government
and U.S. Postal Service workers. State government may report their data as either individual wage rates or interval wage rates.
For example, suppose an establishment employs 10 secretaries at the following wage rates:
- $9/hour — 1 secretary
- $10/hour — 1 secretary
- $12/hour — 2 secretaries
- $13/hour — 2 secretaries
- $14/hour — 2 secretaries
- $16/hour — 1 secretary
- $17/hour — 1 secretary
Wage rates for secretaries, however, are reported to the OES survey as follows:
- Wage interval A (under $9.25/hour) — 1 secretary
- Wage interval B ($9.25-$11.49/hour) — 1 secretary
- Wage interval C ($11.50-$14.49/hour) — 6 secretaries
- Wage interval D ($14.50-$18.24/hour) — 2 secretaries
The remaining wage intervals have 0 secretaries.
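The grouping in this example can be reproduced with a short sketch that assigns each hourly wage rate to a wage interval using the lower bounds in Table 1. The interval letters between D and L are inferred from the lettering used in this statement, and the function name is hypothetical.

```python
from bisect import bisect_right
from collections import Counter

# Lower bounds of intervals B through L, from Table 1; interval A is "under $9.25".
LOWER_BOUNDS = [9.25, 11.50, 14.50, 18.25, 22.75, 28.75,
                36.00, 45.25, 57.00, 71.50, 90.00]
LABELS = "ABCDEFGHIJKL"

def wage_interval(hourly_wage):
    """Return the letter of the wage interval containing hourly_wage."""
    return LABELS[bisect_right(LOWER_BOUNDS, hourly_wage)]

# The 10 secretaries from the example above.
wages = [9, 10, 12, 12, 13, 13, 14, 14, 16, 17]
counts = Counter(wage_interval(w) for w in wages)
```

The resulting counts match the reported data: one secretary in interval A, one in B, six in C, and two in D.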
Because wage rates are collected as grouped data, we must use grouped data formulas to calculate estimates of mean and percentile wage rates. Assumptions are made when using grouped
data formulas. For the mean wage rate formula, we assume that we can calculate the average wage rate for workers in each interval. For the percentile wage rate formula, we assume that
workers are evenly distributed in each interval.
Wage data from the following panels — May 2012, November 2011, May 2011, November 2010, May 2010, and November 2009 — were used to calculate May 2012 wage rate estimates. Wage data
from different panels, however, are not equivalent in real-dollar terms due to inflation and changing compensation costs. Consequently, wage data collected prior to the current survey
reference period have to be updated or aged to approximate that period.
Determining a mean wage rate for each interval
The mean hourly wage rate for all workers in any given wage interval cannot be computed using grouped data collected by the OES survey. This value is calculated externally using data
from the Bureau's National Compensation Survey (NCS). Although smaller than the OES survey in terms of sample size, the NCS program, unlike OES, collects individual wage data for private
sector and state and local government employees. With the exception of the highest wage interval, mean wage rates for each panel are calculated using NCS data for the panel's previous
reference year, since this is the latest data available. The lower boundary of the highest wage interval was $90.00. The mean hourly wage for this interval was calculated using the average
of the 2009, 2010, and 2011 NCS data. The mean hourly wage rate for interval L (the upper, open-ended wage interval) is calculated without wage data for pilots. This occupation is excluded
because pilots work fewer hours than workers in other occupations. Consequently, their hourly wage rates are much higher.
Wage aging process
Aging factors are developed from the Bureau's Employment Cost Index (ECI) survey. The ECI survey measures the rate of change in compensation for ten major occupation groups on a
quarterly basis. The open-ended upper wage interval is not aged. Aging factors are used to adjust OES wage data in past survey reference periods to the current survey reference period
(May 2012). The procedure assumes that there are no major differences by geography, industry, or detailed occupation within the occupational division.
Mean hourly wage rate estimates
For data from local government, private, and certain state government establishments, the mean hourly wage is the total weighted hourly wages for an occupation divided by its weighted
survey employment. Estimates of mean hourly wages are calculated using a standard grouped data formula that was modified to use ECI aging factors.
$$\hat{R}_o = \frac{\sum_{z=t-5}^{t} \sum_{i \in z} w_i \, BMF_i \, \hat{y}_{io}}{\hat{X}_o}, \qquad \hat{y}_{io} = u_{zo} \sum_{r} x_{ior} \, c_{zr}$$
where
- $o$ = occupation
- $\hat{R}_o$ = mean hourly wage rate for occupation $o$
- $z$ = panel (or year)
- $t$ = current panel
- $w_i$ = six-panel combined sample weight for establishment $i$
- $BMF_i$ = final benchmark factor applied to establishment $i$
- $\hat{y}_{io}$ = unweighted total hourly wage estimate for occupation $o$ in establishment $i$
- $r$ = wage interval
- $\hat{X}_o$ = estimated employment for occupation $o$
- $x_{ior}$ = reported employment for occupation $o$ in establishment $i$ in wage interval $r$
(note that establishment $i$ reports data for only one panel $z$ or one year $z$)
- $u_{zo}$ = ECI aging factor for panel (or year) $z$ and occupation $o$
- $c_{zr}$ = mean hourly wage for interval $r$ in panel (or year) $z$
In this formula, $c_{zr}$ represents the mean hourly wage of interval $r$ in panel (or year) $z$. The mean is computed externally using data from the
Bureau's NCS survey.
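The grouped-data mean can be sketched as follows for interval-reporting establishments. Each establishment's interval employment is multiplied by the interval mean wage for its panel, aged by the factor for its panel, then weighted and summed; the result is divided by weighted employment. The interval means, aging factors, panel labels, and record layout shown here are hypothetical values chosen for illustration, not NCS or ECI figures.

```python
def establishment_wage_total(employment_by_interval, interval_means, aging_factor):
    """Aged, unweighted total hourly wages for one occupation in one establishment."""
    return aging_factor * sum(
        n * interval_means[r] for r, n in employment_by_interval.items()
    )

def mean_hourly_wage(establishments, interval_means_by_panel, aging_factors):
    """Weighted total aged wages divided by weighted employment."""
    weighted_wages = 0.0
    weighted_employment = 0.0
    for e in establishments:
        scale = e["weight"] * e["bmf"]
        z = e["panel"]
        weighted_wages += scale * establishment_wage_total(
            e["employment"], interval_means_by_panel[z], aging_factors[z]
        )
        weighted_employment += scale * sum(e["employment"].values())
    return weighted_wages / weighted_employment

# Hypothetical inputs for a single occupation.
interval_means_by_panel = {
    "May 2012": {"C": 13.0, "D": 16.0},
    "Nov 2011": {"C": 13.0, "D": 16.0},
}
aging_factors = {"May 2012": 1.0, "Nov 2011": 1.02}
establishments = [
    {"weight": 2.0, "bmf": 1.0, "panel": "May 2012", "employment": {"C": 6, "D": 2}},
    {"weight": 1.0, "bmf": 1.0, "panel": "Nov 2011", "employment": {"C": 4}},
]
mean_wage = mean_hourly_wage(establishments, interval_means_by_panel, aging_factors)
```

With these inputs the weighted aged wages total 273.04 and weighted employment is 20, giving a mean hourly wage of about 13.65.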
For data from federal and certain state government establishments that provide individual wage rates, the hourly wages for an occupation within an establishment are summed to get total wages.
Employment for that occupation within that establishment is also summed to get total employment. The total wages and total employment across all establishments in the occupation for the
estimation level of interest are summed.
Mean Wage = (Total Interval Wages + Total Individual Wages) / (Total Interval Employment + Total Individual Employment)
Percentile hourly wage rate estimates
The p-th percentile hourly wage rate for an occupation is the wage where p percent of all workers earn that amount or less and where (100-p) percent of all workers earn that amount or more.
The wage interval containing the p-th percentile hourly wage rate is located using a cumulative frequency count of estimated employment across all wage intervals. After the targeted wage
interval is identified, the p-th percentile wage rate is then estimated using a linear interpolation procedure. This statistic is calculated by first distributing federal, state, local government,
and private workers inside each wage interval. Federal and certain state government workers are distributed throughout the wage intervals according to their wage rates, while certain state
government, local government, and private workers are distributed uniformly within each wage interval. Next, workers are ranked from lowest paid to highest paid. Finally, the product of the total
employment for the occupation and the desired percentile is calculated to determine the worker that earns the p-th percentile wage rate.
- pRo = p-th percentile hourly wage rate for occupation o
- r = wage interval that encompasses pRo
- Lr = lower bound of wage interval r
- Ur = upper bound of wage interval r
- fr = number of workers in interval r
- j = difference between the number of workers needed to reach the p-th percentile wage rate and the number of workers needed to reach the Lr wage rate
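The interpolation step can be illustrated with a short sketch. Given the terms defined above, linear interpolation within interval r implies pRo = Lr + (j / fr) x (Ur - Lr); this expression is reconstructed from the listed terms rather than quoted from the survey documentation. The Python function below is a minimal illustration that assumes all workers are spread uniformly within closed wage intervals, so it does not reproduce the separate placement of federal and certain state government workers by their individual wage rates. The function name and input layout are hypothetical.

```python
def percentile_wage(intervals, p):
    """Estimate the p-th percentile wage by linear interpolation.

    intervals: list of (lower_bound, upper_bound, workers) tuples sorted by
        lower bound; workers is the estimated employment in the interval.
    p: desired percentile, between 0 and 100.
    """
    total = sum(workers for _, _, workers in intervals)
    if total <= 0:
        raise ValueError("total employment must be positive")

    # Rank of the worker who earns the p-th percentile wage
    target = total * p / 100.0

    cumulative = 0.0
    for lower, upper, workers in intervals:
        if workers > 0 and cumulative + workers >= target:
            # j: workers needed beyond those already counted below Lr
            j = target - cumulative
            return lower + (j / workers) * (upper - lower)
        cumulative += workers
    raise ValueError("percentile could not be located")
```

With intervals of $10-$12 (100 workers), $12-$16 (200 workers), and $16-$20 (100 workers), the median worker is the 200th of 400, which falls halfway through the second interval.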
Annual wage rate estimates
These estimates are calculated by multiplying mean or percentile hourly wage rate estimates by a "year-round, full time" figure of 2,080 hours (52 weeks x 40 hours) per year. These estimates,
however, may not represent mean annual pay should the workers work more or less than 2,080 hours per year.
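As a small illustration of this conversion, the following Python sketch applies the 2,080-hour figure to an hourly rate. The function and constant names are hypothetical.

```python
HOURS_PER_YEAR = 52 * 40  # "year-round, full time" figure of 2,080 hours


def annual_wage(hourly_wage):
    """Convert a mean or percentile hourly wage rate to an annual figure."""
    return hourly_wage * HOURS_PER_YEAR
```

An hourly rate of $25.00 corresponds to an annual figure of $52,000 under this convention.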
Alternatively, some workers are paid based on an annual basis but do not work the usual 2,080 hours per year. For these workers, survey respondents report annual wages. Since the survey does
not collect the actual number of hours worked, hourly wage rates cannot be derived from annual wage rates with any reasonable degree of confidence. Only annual wages are reported for some occupations.
Occupational employment variance estimation
A subsample replication technique called the "jackknife random group" is used to estimate variances of occupational employment. In this technique, each sampled establishment is assigned to one
of G random groups. G subsamples are created from the G random groups. Each subsample is reweighted to represent the universe.
G estimates of total occupational employment, X̂hjog (one estimate per subsample), are calculated. The variability among the G employment estimates provides a good estimate of the variance of occupational employment. The two formulas below are used to estimate the
variance of occupational employment for an estimation cell defined by geographic area and industry group.
- h = estimation cell defined by geographic area and industry group
- j = employment size class (1-19, 20-49, 50-249, 250+)
- o = occupation
- v(X̂hjo) = estimated variance of X̂hjo
- G = number of random groups
- X̂hjo = estimated employment of occupation o in cell h and size class j
- X̂hjog = estimated employment of occupation o in cell h, size class j, and subsample g
- X̄hjo = estimated mean employment for occupation o in cell h and size class j based on the G subsamples
(Note: a finite population correction factor is applied to the terms X̂hjog and X̄hjo.)
The variance for an occupational employment estimate in cell h is obtained by summing the variances v(X̂hjo) across all size classes j in the cell.
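The calculation described in this section can be sketched in Python. The sketch assumes the standard random group form of the estimator, in which the squared deviations of the G subsample estimates from their mean are summed and divided by G(G - 1); that form is an assumption here, not a quotation of the survey's formula. It also omits the finite population correction mentioned in the note above. Function names and the input layout are hypothetical.

```python
def random_group_variance(group_estimates):
    """Variance for one size class from G subsample employment estimates."""
    G = len(group_estimates)
    if G < 2:
        raise ValueError("at least two random groups are required")
    mean = sum(group_estimates) / G  # mean of the G subsample estimates
    squared_deviations = sum((x - mean) ** 2 for x in group_estimates)
    return squared_deviations / (G * (G - 1))


def cell_variance(estimates_by_size_class):
    """Sum the size-class variances to get the variance for cell h.

    estimates_by_size_class: dict mapping a size class label (for example
        "1-19") to the list of G subsample estimates for that class.
    """
    return sum(random_group_variance(groups)
               for groups in estimates_by_size_class.values())
```

For example, subsample estimates of 100, 110, 90, and 100 have a mean of 100 and squared deviations that sum to 200, giving a variance of 200 / 12 for that size class.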
Occupational mean wage variance estimates
Because the OES wage data are collected in intervals (grouped), we do not capture the exact wage of each worker. Therefore, some components of the wage variance are approximated using factors developed
from NCS data. A Taylor Series Linearization technique is used to develop a variance estimator appropriate for OES mean wage estimates. The primary component of the mean wage variance, which accounts for
the variability of the observed sample data, is estimated using the standard estimator of variance for a ratio estimate. This component is the first term in the formula given below:
- R̂o = estimated mean wage for occupation o
- v(R̂o) = estimated variance of R̂o
- X̂o = estimated occupational employment for occupation o
- h = stratum (area/industry/size class)
- fho = sampling fraction for occupation o in stratum h
- nho = number of sampled establishments that reported occupation o in stratum h
- wi = six-panel combined sample weight for establishment i
- BMFi = final benchmark factor applied to establishment i
- qio = (ŷio - R̂o xio) for occupation o in establishment i
- ŷio = estimated total occupational wage in establishment i for occupation o
- xio = reported employment in establishment i for occupation o
- q̄ho = mean of the qio quantities for occupation o in stratum h
- Θor = proportion of employment within interval r for occupation o
- xior = reported employment in establishment i within wage interval r for occupation o
- σ²cr, σ²er, and σ²ωr = variance components within wage interval r, estimated using the NCS. They represent, respectively, the variability of the wage value imputed to each worker, the variability of wages across establishments, and the variability of wages within establishments.
Reliability of the estimates
Estimates developed from a sample will differ from the results of a census. An estimate based on a sample survey is subject to two types of error: sampling error and nonsampling error. An estimate based on a
census is only subject to nonsampling error.
Nonsampling error is attributable to several causes, such as errors in the sampling frame; an inability to obtain information for all establishments in the sample; differences in respondents'
interpretation of a survey question; an inability or unwillingness of the respondents to provide correct information; errors made in recording, coding, or processing the data; and errors made in imputing
values for missing data. Explicit measures of the effects of nonsampling error are not available.
When a sample, rather than an entire population, is surveyed, estimates differ from the true population values that they represent. This difference, or sampling error, occurs by chance, and its
variability is measured by the variance of the estimate or the standard error of the estimate (square root of the variance). The relative standard error is the ratio of the standard error to the
estimate itself.
Estimates of sampling error are provided for all employment and mean wage estimates so that data users can determine whether those statistics are reliable
enough for their needs. Only a probability-based sample can be used to calculate estimates of sampling error. The formulas used to estimate OES variances are adaptations of formulas appropriate for the
survey design used.
The particular sample used in this survey is one of a large number of possible samples of the same size that could have been selected using the same sample design. Sample estimates from a given
design are said to be unbiased when an average of the estimates from all possible samples yields the true population value. In this case, the sample estimate and its standard error can be used to construct
confidence intervals, or ranges of values that include the true population value with known probabilities. To illustrate, if the process of selecting a sample from the population were repeated many times,
if each sample were surveyed under essentially the same unbiased conditions, and if an estimate and a suitable estimate of its standard error were made from each sample, then:
- Approximately 68 percent of the intervals from one standard error below to one standard error above the estimate would include the true population value. This interval is called
a 68-percent confidence interval.
- Approximately 90 percent of the intervals from 1.6 standard errors below to 1.6 standard errors above the estimate would include the true population value. This interval is called
a 90-percent confidence interval.
- Approximately 95 percent of the intervals from 2 standard errors below to 2 standard errors above the estimate would include the true population value. This interval is called
a 95-percent confidence interval.
- Almost all (99.7 percent) of the intervals from 3 standard errors below to 3 standard errors above the estimate would include the true population value.
For example, suppose that an estimated occupational employment total is 5,000, with an associated estimate of relative standard error of 2.0 percent. Based on these data, the standard error of the
estimate is 100 (2 percent of 5,000). To construct a 90-percent confidence interval, add and subtract 160 (1.6 times the standard error) from the estimate: (4,840, 5,160). Approximately 90 percent of
the intervals constructed in this manner will include the true occupational employment if survey methods are nearly unbiased.
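The arithmetic in this example can be expressed as a short Python sketch. The function name is hypothetical, and the default multiplier of 1.6 is the rounded 90-percent value used in the text above.

```python
def confidence_interval(estimate, rse_percent, multiplier=1.6):
    """Build a confidence interval from an estimate and its relative standard error.

    rse_percent: relative standard error, expressed as a percent of the estimate.
    multiplier: number of standard errors on each side (1.6 for about 90 percent,
        2 for about 95 percent, per the approximations listed above).
    """
    standard_error = estimate * rse_percent / 100.0
    margin = multiplier * standard_error
    return (estimate - margin, estimate + margin)


lower, upper = confidence_interval(5000, 2.0)
print(round(lower), round(upper))
```

Applied to an estimate of 5,000 with a relative standard error of 2.0 percent, the standard error is 100 and the margin is 160, matching the interval given in the example.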
Estimated standard errors should be taken to indicate the magnitude of sampling error only. They are not intended to measure nonsampling error, including any biases in the data. Particular care should
be exercised in the interpretation of small estimates or of small differences between estimates when the sampling error is relatively large or the magnitude of the bias is unknown.
Quality control measures
Several edit and quality control procedures are used to reduce nonsampling error. For example, completed survey questionnaires are checked for data consistency. Follow-up mailings are sent and phone calls are
made to nonresponding establishments to improve the survey response rate.
The OES survey is a federal-state cooperative effort that enables states to conduct their own surveys. A major concern with a cooperative program such as OES is to accommodate the needs of BLS and other
federal agencies, as well as state-specific publication needs, with limited resources while simultaneously standardizing survey procedures across all 50 states, the District of Columbia, and the U.S.
territories. Controlling sources of nonsampling error in this decentralized environment can be difficult. One important computerized quality control tool used by the OES survey is the Survey Processing and
Management system. It was developed to provide a consistent and automated framework for survey processing and to reduce the workload for analysts at the state, regional, and national levels.
To ensure standardized sampling methods in all areas, the sample is drawn in the national office. Standardizing data-processing activities, such as validating the sampling frame, allocating and selecting
the sample, refining mailing addresses, addressing envelopes and mailers, editing and updating questionnaires, conducting electronic review, producing management reports, and calculating employment estimates,
has resulted in the overall standardization of the OES survey methodology. This has reduced the number of errors on the data files as well as the time needed to review them.
Other quality control measures used in the OES survey include:
- Follow-up mail and telephone solicitations of nonrespondents, especially critical or large nonrespondents
- Review of schedules to verify the accuracy and reasonableness of the reported data
- Adjustments for atypical reporting units on the data file
- Validation of the benchmark employment figures and of the benchmark factors
- Validation of the analytical tables of estimates at the 4-, 5-, and 6-digit NAICS levels
BLS has a strict confidentiality policy that ensures that the survey sample composition, lists of reporters, and names of respondents will be kept confidential. Additionally, the policy assures respondents
that published figures will not reveal the identity of any specific respondent and will not allow the data of any specific respondent to be imputed. The most relevant statute which governs BLS confidentiality
is the Confidential Information Protection and Statistical Efficiency Act (CIPSEA). Each published estimate is screened to ensure that it meets these confidentiality requirements. To further protect the
confidentiality of the data, the specific screening criteria are not listed in this publication. For additional information regarding confidentiality, please visit the BLS website at
Data presentation
OES data are available in several formats from the OES home page at www.bls.gov/oes. The OES database search tool (www.bls.gov/oes/#databases) allows
customers to create customized HTML or Excel tables using the most recent OES estimates. OES data are also published as HTML tables or can be downloaded as zipped XLS files at
www.bls.gov/oes/tables.htm. Included are cross-industry data for the United States as a whole, for individual U.S. states, and for metropolitan and nonmetropolitan areas,
along with U.S. industry-specific estimates by 2-, 3-, 4-, and some 5- and 6-digit NAICS levels. Available data elements include estimates of employment, hourly and annual mean wages, and hourly and
annual percentile wages by occupation, as well as relative standard errors (RSEs) for the employment and mean wages estimates.
When updated estimates become available, a BLS news release makes an announcement providing a summary of U.S. data. For additional information, contact the OES staff at (202) 691-6569 or send
e-mail to firstname.lastname@example.org.
For many years, the OES survey has been a major source of detailed occupational employment data by industry for the nation, states, and areas. This survey provides information for many data users,
including individuals and organizations engaged in planning vocational education programs, higher education programs, and employment and training programs. OES data also are used to prepare information
for career counseling, for job placement activities performed at state workforce agencies, and for personnel planning and market research conducted by private enterprises. OES data also are used by the
Department of Labor's Foreign Labor Certification (FLC) program, which sets the rate at which workers on work visas in the United States must be paid.
Last modified: April 18, 2013