An innovative health clinic for new immigrants sees few patients trickle through its doors. Community health advisers scratch their heads. A state-of-the-art new Roman Catholic school is built in the wrong neighbourhood and shutters its doors. The local school board faces ratepayer anger over the misspent money.
These are some of the more dire potential consequences for Canadian communities if information gleaned from the controversial voluntary alternative to the long-form census paints the inaccurate portrait of Canada that experts fear.
Economists and statisticians are skeptical about the accuracy and usefulness of data that begin to trickle in this week from the inaugural 2011 National Household Survey. The Conservative government decided in 2010 that the survey would replace the mandatory long-form census, despite its own acknowledgement that the decision was made without consultation, an ensuing outcry over the hasty move, and warnings it would jeopardize the quality of Canadian information.
The first report using the new voluntary data collection method — relating to aboriginals, immigrants and ethnodiversity — will be released on Wednesday.
Experts who have for years relied on census data for an updated look at changes in such groups in Canadian society now question how useful the new information will be.
The worst-case scenario is that all levels of government, as well as non-profit and private sector groups, will make decisions about community planning based on the wrong information, said David Bellhouse, a statistics professor at the University of Western Ontario.
“The tragedy of it is, the government was warned that this would happen,” he said.
Prime Minister Stephen Harper's decision to cancel the long-form census over privacy concerns met with a wave of backlash from groups, including opposition parties, community organizations, professional associations, economists and government analysts. Former chief statistician Munir Sheikh resigned in protest over the death of the mandatory census, delivering a definitive message that a voluntary survey cannot replace a mandatory census.
Despite the best efforts of Statistics Canada analysts to mitigate problems inherent in voluntary surveys, the agency has issued strong warnings about the validity of the NHS data.
"We have never previously conducted a survey on the scale of the voluntary National Household Survey, nor are we aware of any other country that has. The new methodology has been introduced relatively rapidly with limited testing," it cautions on its website.
"We are confident that the National Household Survey will produce usable and useful data that will meet the needs of many users. It will not, however, provide a level of quality that would have been achieved through a mandatory long-form census."
Experts warn that results from the survey will contain biases, that the new methodology renders it incompatible with previous censuses and that those biases will muddy other Statistics Canada surveys, which are used as the basis for policy planning.
Economists and statisticians do not buy the explanation that privacy-related complaints were behind the government’s decision to kill the long-form census — census data are protected by privacy legislation. Some believe the Tories eliminated the long-form census for purely political reasons.
Critics say the move to the NHS could allow the government to justify a reallocation of money away from programs for members of under-represented groups and that the data could be more easily manipulated by people with particular agendas. Others worry that the cost-conscious government could cite the problematic quality of data from the survey as a reason to kill any attempt to collect data on such a wide scale.
“It is a terrible disaster that for political reasons we lost a wonderful tool, which really (placed) Canada at the forefront of analysis,” said economist Paul Jacobson of Jacobson Consulting Inc. “We have not gained a replacement that can be as good, period. We have gained something that will be as good as it can be, I hope.”
The long-form census, which was sent to one in five Canadian households and produced a 94 per cent response rate, was said to produce a non-biased sample of the Canadian population and was one of the most important planning tools in Canada.
Statistics Canada, economists and statisticians say the biggest issue with the new survey is a problem inherent in voluntary methods of data collection called the "non-response bias," which holds that marginalized groups — the very groups most in need of services — are the least likely to volunteer information, which means their status is under-represented.
Those groups include the poor, immigrants, aboriginals, the less educated and mobile students.
That problem, experts say, could skew data on everything from Canada's religious composition to income levels, which could hurt planning for the country's future.
The government has tried to adjust for the anticipated lack of response by sending the NHS to one in three households, a 65 per cent increase over the long-form census.
Experts say, however, a larger sample size does not solve the non-response bias because the same groups of people are still less likely to respond even in a bigger pool of people.
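The point about sample size can be illustrated with a minimal sketch. Every figure in it is a hypothetical assumption chosen for illustration (the household count, the 30 per cent low-income share and the group response rates are not Statistics Canada numbers). It compares the expected mix of respondents when a survey goes to one in five households and when it goes to one in three.

```python
from fractions import Fraction

# Hypothetical figures for illustration only; none come from Statistics Canada.
POPULATION_SHARE = {"low_income": Fraction(3, 10), "other": Fraction(7, 10)}
RESPONSE_RATE = {"low_income": Fraction(2, 5), "other": Fraction(4, 5)}


def expected_respondents(households, sampling_fraction):
    """Expected number of responding households in each group."""
    sampled = households * sampling_fraction
    return {
        group: sampled * POPULATION_SHARE[group] * RESPONSE_RATE[group]
        for group in POPULATION_SHARE
    }


def low_income_share(respondents):
    """Share of respondents who belong to the low-income group."""
    return respondents["low_income"] / sum(respondents.values())


HOUSEHOLDS = 15_000_000  # hypothetical household count
one_in_five = expected_respondents(HOUSEHOLDS, Fraction(1, 5))
one_in_three = expected_respondents(HOUSEHOLDS, Fraction(1, 3))

for label, respondents in [("1 in 5", one_in_five), ("1 in 3", one_in_three)]:
    total = sum(respondents.values())
    share = low_income_share(respondents)
    print(f"{label}: {int(total):,} respondents, low-income share {float(share):.1%}")
```

Under these assumptions the larger mailing produces more completed forms, but low-income households make up the same fraction of respondents in both cases, well below their assumed 30 per cent share of the population. The sampling fraction cancels out of the ratio, which is why a bigger sample alone leaves the bias in place.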
The response rate to the NHS was 68.6 per cent, a rate that, though larger than what StatsCan had prepared for, is not an acceptable basis for comparisons, Bellhouse said. In some communities, he said, response rates were as low as 25 per cent, and a few even had a response rate of zero.
"When you get a response rate that is that low, especially from a government survey, there could be quite a difference between the people that have responded and those that haven’t," he said.
Because of the minimal response in some areas, the data are basically useless at the community level, Bellhouse said. The lack of detailed information makes it difficult for communities to plan bus routes or for advertisers to target their marketing campaigns.
“Those data are completely unreliable in terms of any kind of planning purpose for people wanting to use the data for planning about their community,” Bellhouse said.
Kevin Milligan, an economics professor at the University of British Columbia, is particularly worried about low-income Canadians being under-represented because they are less likely to report their incomes. Some economists believe high-income Canadians are also less likely to report their earnings truthfully in a voluntary survey. That could make Canada's income equality picture appear rosier than it is because all data obtained will be skewed toward the middle class, Milligan said.
“The big issue here is the overall distribution of income, any changes that we see — and there have been big changes over the past 30 years — we don’t know when we look at the NHS if what we're seeing is a result of changes in the sampling procedure or changes in the underlying distribution of income,” Milligan said.
Frances Woolley, a professor of economics at Carleton University, recently wrote a blog post asking how economists should approach the data and whether they should use them at all.
Using religion as an example of the challenges to validity in voluntary information, she pointed to potential problems in the first set of data being released — covering groups such as aboriginals and immigrants, who are among those believed to be least likely to respond.
“Religion’s a subjective thing,” she said, adding that this makes it difficult to adjust the results based on information from previous censuses. Many of the changes are coming from new immigrants or young people, who are less likely to respond.
There is also a strong correlation between religion and willingness to fill out a survey, as some religions encourage volunteerism and civic engagement, while others breed introverts who exclude themselves from non-religious society.
“How are you going to adjust for the fact that religious people are going to be more or less likely to fill out the survey?” she asked. “We won’t know if it’s a real trend or if it’s a change due to the different method of data collection.”
Despite the inherent non-response bias, the survey data will be useful in some regards, but not in the way economists find most interesting: comparing current results with the past to discover trends, Jacobson said.
StatsCan warns that there is a real risk that the different methodology will affect the comparability of the NHS data.
That has some economists, such as Jacobson, on edge.
“We’re not going to be able to tell whether they’re different, and telling whether things are different is what matters to an economist,” he said.
The long-form census, which was considered an accurate representative sample of Canada, was used as an anchor or the control to reduce the risk of bias in other StatsCan surveys, such as the closely watched Labour Force Survey.
The unreliable NHS data are still likely to be used as an anchor because the survey is the best tool available, but that could exaggerate, rather than reduce, biases in other surveys.
Statisticians use something called weighting in survey data to make the result more representative of the entire population.
For instance, if too few low-income people respond, each of their answers might be weighted as if it came from two people to even out the responses. The long-form census, because of its depth and completeness, used to be the population benchmark for all other surveys. The NHS is a survey that will itself have to be weighted, which undermines its validity as that benchmark.
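The weighting idea can be sketched in a few lines. This is an illustrative example of one simple approach (weighting each group by its population share divided by its share among respondents), not a description of Statistics Canada's actual procedure. The eight respondents, their incomes and the benchmark shares in `BENCHMARK_SHARE` are all hypothetical.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical benchmark: each group's share of the full population.
# In practice this has to come from a trusted source such as a census.
BENCHMARK_SHARE = {"low_income": Fraction(1, 4), "other": Fraction(3, 4)}

# Hypothetical respondents as (group, income); low-income households are
# under-represented (1 in 8 respondents versus 1 in 4 of the population).
RESPONDENTS = [("low_income", 20_000)] + [("other", 60_000)] * 7


def group_weights(respondents, benchmark_share):
    """Weight per group = population share / share among respondents."""
    counts = Counter(group for group, _ in respondents)
    n = len(respondents)
    return {g: benchmark_share[g] / Fraction(counts[g], n) for g in counts}


def weighted_mean(respondents, weights):
    """Mean income with each respondent counted according to its group weight."""
    total = sum(weights[g] * income for g, income in respondents)
    return total / sum(weights[g] for g, _ in respondents)


weights = group_weights(RESPONDENTS, BENCHMARK_SHARE)
unweighted = Fraction(sum(income for _, income in RESPONDENTS), len(RESPONDENTS))
weighted = weighted_mean(RESPONDENTS, weights)

print("weights:", {g: float(w) for g, w in weights.items()})
print("unweighted mean:", float(unweighted))
print("weighted mean:", float(weighted))
```

In this toy case the single low-income respondent counts as two people, and the weighted average income drops from 55,000 to 50,000. The correction is only as good as the benchmark shares fed into it: if those shares are themselves estimated from a biased survey, the weights inherit that bias.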
“The reason we knew what the whole population looks like is because we had the census that was the anchor of the whole system. Now the problem is we don't have that anchor to compare all these voluntary surveys to, and they become much less useful if we don’t have that anchor anymore,” Milligan said.
"The fact that there’s going to be survey weights (in the NHS) is an admission that this new survey is not representative, because, if it was representative in itself, you don't have to have survey weights to undo the bias."
Experts, who say they do not doubt the expertise and competence of Statistics Canada, await a clearer picture on the caveats and methodology used, as well as the level of confidence the agency has in the data, when the first results are released.
StatsCan analysts face a monumental challenge in trying to sift through what they have and use their methods to make the results as accurate as they can, Jacobson said.
"How StatsCan presents this information will define their reputation."
Canada had been consistently ranked at the forefront of information gatherers in the world, experts say, but many question how that reputation can continue after it presents the watered-down NHS data.
CUPE economist Toby Sanger said there will always be an element of distrust with the NHS data, and he questions the government's motives, especially given that it is also slashing staff and resources at the agency. The end of accurate data contributes to an erosion of evidence-based public policy in this country, he said.
“In a broader sense, it’s sad for Statistics Canada. Statistics Canada was considered one of the pre-eminent, one of the best statistical agencies in the world. I’m not sure it would be considered in that place anymore.”
Some information crunchers are still in awe of the government's hasty decision and see it as yet another way the Tories are working to stifle scientific research and access to information.
"[The government] ignored scientific evidence and muzzled a lot of scientists, and in [getting rid of the long-form census] they've degraded one of the most important institutions that we have in terms of providing solid and objective information," Sanger said.
There is nothing that kills social science research more quickly than lack of good data, Woolley said. “If you want to kill social science research, the best thing to do is to destroy the data.”
Bellhouse, admittedly cynical, said it appears the Conservatives do not want the public to have the information that they would need to question the government’s policy decisions, and he believes the lack of quality data from the NHS would be a convenient excuse to kill the Canada-wide survey altogether.
“If you don't have the information, then they can do whatever they want, because no one has correct information.”