(December 2000) Census Bureau director Kenneth Prewitt calls Census 2000 a "good" census but still expects it to miss counting millions of people. Since the Census Bureau wants its numbers to be "as accurate as possible for all uses of the data," it will fix the enumerated counts using statistical adjustment. Congress, still uncertain about adjustment, is also requiring the Census Bureau to release data that do not include any statistical corrections. So, by early 2001, when the first results from Census 2000 are released, there will be two sets of results from Census 2000.

But is the census so badly broken that such a historic step is necessary? Here are some reasons why I think it is not.

Counting vs. Capturing

There are two approaches to measuring the undercount; unfortunately, neither is immune from errors.

The demographic approach calculates the undercount for the whole country. It compares the actual census count with an expected population, built from the previous census count plus the demographic events (births, deaths, and migration) between the two censuses. The demographic approach suffers because illegal immigration distorts the number of migrants. In addition, estimates of the undercount by race and ethnicity are affected by inconsistent definitions used by different administrative records. And the demographic approach is difficult to use for local areas because internal migration is not measured.
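The arithmetic behind the demographic approach can be sketched in a few lines. This is a minimal illustration, not the Census Bureau's procedure: the function names and every figure below are hypothetical, chosen only to make the calculation easy to follow.

```python
def demographic_estimate(previous_count, births, deaths, net_migration):
    """Expected population: prior census count plus births, minus deaths,
    plus net migration over the period between the two censuses."""
    return previous_count + births - deaths + net_migration


def net_undercount_rate(expected, census_count):
    """Net undercount expressed as a fraction of the expected population."""
    return (expected - census_count) / expected


# Hypothetical figures, not actual census data.
expected = demographic_estimate(
    previous_count=1_000_000,
    births=150_000,
    deaths=90_000,
    net_migration=40_000,
)
rate = net_undercount_rate(expected, census_count=1_078_000)
print(expected, round(rate, 4))
```

Any error in the migration figure passes straight through to the estimated undercount, which is why unmeasured migration is such a weakness of this approach.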

The dual system approach, based on capture-recapture methods from wildlife biology, can measure the undercount for state and local populations. The dual system approach assumes that the population in an area can be estimated by first "capturing" (counting) people in the census, then recounting them using a special sample survey and identifying the fraction "recaptured" (counted by both census and survey).

For animals, the area population is equal to the census count inflated by the ratio of the recount to the recaptured. The method used to calculate dual system estimates for people is more complicated. For humans, the ratio of the recount to the recaptured is refined based on information about erroneous enumerations (duplicates), unmatchable persons in the census (omissions), and the rate of matching survey to census records.
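The simple animal version of the calculation can be written out directly. The sketch below covers only that basic form, with hypothetical counts; it leaves out the refinements for erroneous enumerations, unmatchable persons, and matching rates that the human version requires.

```python
def dual_system_estimate(census_count, recount, recaptured):
    """Basic capture-recapture estimate: the census count inflated by the
    ratio of the recount to the number recaptured (found in both)."""
    if recaptured <= 0:
        raise ValueError("at least one recaptured individual is required")
    return census_count * recount / recaptured


# Hypothetical area: the census counted 900, the survey recounted 100,
# and 90 of those 100 were also found in the census.
estimate = dual_system_estimate(census_count=900, recount=100, recaptured=90)
missed = estimate - 900
print(estimate, missed)
```

In this made-up case the survey found 90 percent of its people in the census, so the method infers that the census captured 90 percent of the area's population.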

The dual system approach to measuring the undercount has been criticized because record-matching is prone to errors, especially when people move or vary the spelling or completeness of their name from one study to another. The approach also has been criticized for comparing the census to a sample survey, because surveys have higher error rates than censuses. Furthermore, surveys tend to miss the same people who are missed by the census. In a recent Evaluation Review article, Kenneth Wachter and David A. Freedman estimate that 3 million people were missed both by the 1990 Census and by the post-enumeration survey used to estimate the undercount.

Adding Ghosts, Subtracting People

The desire to adjust census counts at the local level has led to a very complicated process for dual system estimates. Congress requires the Census Bureau to do a special post-enumeration survey, now called the Accuracy and Coverage Evaluation (ACE). Because conducting a random sample of housing units would be time-consuming and expensive, the ACE for the 2000 census will include a random sample of about 12,000 groups of housing units, called block clusters. Interviews of 25 to 30 housing units in each of these block clusters will yield a sample of about 300,000 housing units nationwide.

The ACE survey divides the population into subgroups, called strata, based on geography, race, Hispanic origin, housing tenure (owner/nonowner), age, and sex. Dual system estimates are then calculated for each stratum, representing the "true" number of people in each subgroup. Next, coverage correction factors are estimated for nearly 400 of these subgroups, by comparing the estimate of the true population with the initial census count.

Finally, the coverage correction factors are applied to the census files. If the coverage correction factor for the subgroup of Hispanic black males is 1.02, then the undercount is 2 percent. Therefore, for every 100 people in that post-stratum, two people will be added.
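To make the arithmetic concrete, here is a minimal sketch of how a coverage correction factor could be computed and applied. It assumes the factor is simply the dual system estimate divided by the census count; the function names and the counts of 5,100 and 5,000 are hypothetical, picked so that the factor comes out to the 1.02 used above.

```python
def coverage_correction_factor(dual_system_estimate, census_count):
    """Ratio of the estimated 'true' population to the census count."""
    return dual_system_estimate / census_count


def people_added(census_count, factor):
    """Net number of records the adjustment adds (negative if removed)."""
    return round(census_count * factor) - census_count


# Hypothetical post-stratum: 5,000 counted, 5,100 estimated.
factor = coverage_correction_factor(dual_system_estimate=5_100, census_count=5_000)
added = people_added(census_count=5_000, factor=factor)
added_per_100 = people_added(census_count=100, factor=factor)
print(round(factor, 2), added, added_per_100)
```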

Note the significance of this addition. The only evidence that these people exist is statistical. The additional two people are what the statistical procedure predicts were missed by the census, based on the characteristics of the place and the other people who live there. There is no other evidence that these people exist — no names, driver's licenses, birth certificates, or Social Security cards. They are ghosts.

The correction factors for some demographic groups (such as older nonminority homeowners) are below 1, so the adjustment lowers their counts. For every 100 people in such a post-stratum, people will be subtracted from the census count. So people who participated in the census by completing a form or by speaking with an enumerator will be removed from that count for statistical reasons, not because they moved away or died.

Reslicing the Pie

When then-Secretary of Commerce Robert Mosbacher decided not to adjust the 1990 Census, he stated: "We have to determine both whether the actual count is better and whether the shares of states and cities within the total population is better. The paradox is that in attempting to make the actual count more accurate by an adjustment, we might be making the shares less accurate. The shares are very important because they determine how many congressional seats each state gets, how political representation is allocated within states, and how large a 'slice of the pie' of federal funds goes to each city and state. Any upward adjustment of one share necessarily means a downward adjustment of another. Because there is a loser for every winner, we need solid ground to stand on in making any changes."
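Mosbacher's point about shares can be illustrated with a small calculation. The two areas and all of their counts below are hypothetical; the sketch only shows that an area can gain people through adjustment and still lose share.

```python
def shares(counts):
    """Each area's fraction of the total population."""
    total = sum(counts.values())
    return {area: count / total for area, count in counts.items()}


# Hypothetical counts: both areas are adjusted upward,
# A by 5 percent and B by 1 percent.
census = {"A": 600, "B": 400}
adjusted = {"A": 630, "B": 404}

before = shares(census)
after = shares(adjusted)
for area in census:
    print(area, round(before[area], 4), round(after[area], 4))
```

Area B gains four people yet ends up with a smaller share, because shares always sum to one and area A gained proportionally more.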

Statisticians disagree about how solid the ground is. Professors Kenneth Wachter and David Freedman have criticized the Census Bureau's adjustment procedure for using indirect ways to estimate the undercount, questionable models to impute match status between the Post-Enumeration Survey (PES) and the census, and inappropriate methods to smooth results. They are skeptical about the dual system estimator because errors in the PES may account for half of the undercount. Lastly, they criticize the method for ignoring variability in undercount rates from one geographic area to another.

Counting on the Constitution

If all goes as planned, the 2000 census will be the first decennial census to produce two sets of numbers. This historic step is not necessary. The census is not that badly broken, though the error is not easy to measure. Statistical adjustment is not some minor technical correction that is done using well-established methods. Rather, it is a major change in the way that the American public is enumerated. Since the census was established by the Constitution, the American public and its elected representatives deserve the chance to decide these issues.

Hallie Kintner, staff research scientist with the General Motors Research and Development Center in Warren, Mich., served on the 2000 Census Advisory Committee from 1991 to 1995.