Lies, D*mn Lies, and Statistics Canada II: Internet Privacy & Security

With Statistics Canada having been criticized in the news recently, it’s good to see some of the real applications that impact Canadian businesses and lives, such as the Canadian Internet Use Survey.  But I think practitioners–and the general public–still aren’t quite fulfilling “due diligence” in either citing the Statistics Canada information or in how they perceive and interpret it.  Even following Statistics Canada’s own perfectly-correct guidelines about whom the results do and do not represent or whether a significant correlation can or cannot imply causation, the data may still not be giving the answers we think they are.

Statistics Canada’s Canadian Internet Use Survey is often cited by public interest groups, not-for-profit organizations, and marketers to support all manner of opinions.  What mostly concerns me this time is the portion of it covering Internet Privacy and Security.

Although a mere five questions with only three possible levels of concern (None at all, Concerned, or Very concerned) may have been sufficient to determine that Privacy and Security is one of Canadians’ leading concerns, we now consider Privacy and Security a top concern.  Five questions with only three levels of concern are no longer adequate to be meaningful.  (I am mostly facetious when I propose that the number of Canadians actually concerned was severely overstated because anyone who wasn’t oblivious or reckless was counted as at least “Concerned” in the first place.)  Knowing how important Privacy and Security is, and knowing how often cited those statistics are, I think the Stats Can survey is doing a disservice to Canadians, their concerns, and the businesses that benefit from it.

For example, if people take the time to examine the actual survey questions pertaining to Privacy and Security http://www.statcan.gc.ca/imdb-bmdi/instrument/4432_Q1_V8-eng.htm#a10

Section: Privacy and security (PS)

PS_BEG
Beginning of Section

PS_R01
The next set of questions relate to privacy and security concerns on the Internet.

PS_Q01
In general, how concerned (are you/would you be) about privacy on the Internet? For example, people finding out what websites you have visited, others reading your e-mail?

Interviewer: Read categories to respondent.

  1. Not at all concerned
  2. Concerned
  3. Very concerned
    DK, RF

Coverage: All respondents

PS_Q02
How concerned (are you/would you be) about conducting banking transactions over the Internet?

Interviewer: Read categories to respondent.

  1. Not at all concerned
  2. Concerned
  3. Very concerned
    DK, RF

Coverage: All respondents

PS_Q03
How concerned (are you/would you be) about using your credit card over the Internet?

Interviewer: Read categories to respondent.

  1. Not at all concerned
  2. Concerned
  3. Very concerned
    DK, RF

Coverage: All respondents

PS_Q04
How concerned (are you/would you be) about providing personal financial information to government departments over the Internet? (e.g., applying for employment insurance or a student loan?)

Interviewer: Read categories to respondent.

  1. Not at all concerned
  2. Concerned
  3. Very concerned
    DK, RF

Coverage:  All respondents

PS_Q05
How concerned (are you/would you be) about giving personal, non financial information to a government official in Canada over the Internet?

Interviewer: Read categories to respondent.

  1. Not at all concerned
  2. Concerned
  3. Very concerned
    DK, RF

Coverage: All respondents

PS_END
End of Section

they will note that there are a total of five questions. Those who have taken statistics will recognize that the meaningful options of “Not at all concerned,” “Concerned,” and “Very concerned” imply ordinal data (there is a consistent directionality in the variables).

Those of you who have taken some survey and research design might be concerned, however, that the “centre” choice (questionnaire designers sometimes purposely give an even number of choices to avoid a dead-centre choice) does not at all imply middle of the road. In fact, if a respondent is not absolutely free of concern about privacy (i.e., reckless), then any other choice will enumerate them amongst the concerned. There are many of us who exercise “appropriate” caution when we conduct business online (i.e., would not describe ourselves as either apathetic or reckless) but who also would not consciously be concerned about privacy and security under normal conditions (i.e., would not describe ourselves as neurotic or paranoid).
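To make the scale problem concrete, here is a minimal Python sketch of how the headline "concerned" share depends on where the cut-off sits. The response counts are invented purely for illustration (they are not Statistics Canada data), and the helper name `share_concerned` is my own.

```python
from collections import Counter

# Hypothetical responses to a single PS question (NOT Statistics Canada data).
responses = (
    ["Not at all concerned"] * 25
    + ["Concerned"] * 55
    + ["Very concerned"] * 20
)

def share_concerned(answers, concerned_levels):
    """Return the fraction of answers that fall in the given levels."""
    counts = Counter(answers)
    total = sum(counts.values())
    return sum(counts[level] for level in concerned_levels) / total

# Headline reading: anyone who is not "Not at all concerned" is counted.
headline = share_concerned(responses, {"Concerned", "Very concerned"})

# Stricter reading: only the top category counts as genuinely worried.
strict = share_concerned(responses, {"Very concerned"})

print(f"headline: {headline:.0%}, strict: {strict:.0%}")
```

With these made-up counts the same respondents yield either a large majority or a small minority of "concerned" Canadians, depending only on how the middle option is interpreted.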

Vote for Robin in the 2010 CIRA Board Elections!

The 2008-2009 CIRA Annual Report demonstrates how significantly these data have impacted CIRA’s initiatives, ranging from DNSSEC to BIND10 to WHOIS privacy http://www.cira.ca/annual-reports/2009/en/c_dns_03_en.html. But the primary survey to be cited employs only five questions that will inherently bias responses towards overestimating the amount and degree of concern Canadians have because of its peculiar scale.

Highly-qualified statisticians and researchers at Statistics Canada go to a lot of trouble trying fastidiously to apply accepted theory in questionnaire, survey, and sampling design according to traditional principles of maximizing face validity, content validity, criterion validity, Likert scale best practices, stratified random sampling, and making sure that the report reflects accurate interpretation under the correct circumstances in the proper contexts.

But used out of context or with varying lower degrees of external validity (generalizability), all that effort can be wasted–or worse, reinforce the popular notion that statistics are somehow worse than both lies and d*mn lies http://robincheung.info/mbalog/2010/07/21/lies-dmn-lies-and-statistics-statistics-is-actually-your-friend-when-not-misused/

This time, I’m not blaming people for using statistics out of context to support their arguments; I’m suggesting that Statistics Canada should amend the survey.

There is a mechanism for interested businesses, individuals, and Statistics Canada to understand each other and develop surveys that are more meaningful and accurate, by the way.  This October 26 to 29, 2010, Statistics Canada is hosting the 2010 International Methodology Symposium in Ottawa, ON.   If you can’t make it to that event, Statistics Canada maintains a web site about its training, conferences, and research events: http://www.statcan.gc.ca/services/workshop-atelier-eng.htm

How I CIRA the Internet in Canada

In order to make it to the final ballot, I still need to collect at least 20 votes of support! Please vote for me at the CIRA (Canadian Internet Registration Authority) election site: https://elections.cira.ca/2010/en/election.html

If you are a .ca domain-owner (or care enough about Internet policy, such as CIRA representation of public interests in meetings with the CRTC  and infrastructure in Canada, such as the implementation of DNSSEC, BIND10, and IPv6)

• The branding of dot-ca domains consistent throughout the CIRA site is clear:

  1. the .ca brand is associated with organizations Canadians trust;
  2. eligibility for .ca top-level domains is contingent on a legitimate Canadian connection, as defined by Canadian presence requirements;
  3. .ca domains convey the comfort and confidence needed to form business relationships. Many top-level domains can be registered by anyone, anywhere, whereas a business that emphasizes its .ca domain makes it clear that Canadian law applies to its transactions, which promotes confidence among partners who view a clear legal framework as a source of strength rather than as a restriction to avoid.

I share Rick’s straightforward values, which emphasize that, as a matter of integrity, a .ca top-level domain should reflect a true Canadian presence.

Porter's Five Forces model allows a structured, systematic evaluation of the differential strategic relationship between a firm and its competitive environment

CIRA consistently asserts that the .ca top-level domain is a “key public resource”; however, technology remains a cat-and-mouse game. Even if you are smarter than all but one in a million people, the world is a big place: assuming the latest estimates of just under 2 billion global Internet users, you are trying to out-think roughly 2,000 people who are smarter than you most of the time, or nearly 70 people smarter than anyone in Canada all of the time. What this actually means is that risk from technological threats is easily quantified, and the membership can reliably allocate resources appropriate to the type and value of the assets at risk.
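The back-of-the-envelope figure of 2,000 can be checked with a few lines of Python. This is only a sketch: the user count is rounded up to an even 2 billion, and the helper name `expected_smarter_people` is hypothetical. The "nearly 70" figure depends on a Canadian population figure that this post does not state, so it is not reproduced here.

```python
def expected_smarter_people(population, one_in):
    """How many people in `population` are expected to be at or above a
    'one in `one_in`' level of ability (simple proportion, no distribution assumed)."""
    return population // one_in

# "Just under 2 billion" global Internet users, rounded up for simplicity.
internet_users = 2_000_000_000

# "Smarter than one in a million people."
rarity = 1_000_000

peers = expected_smarter_people(internet_users, rarity)
print(peers)  # 2000
```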

Just as uncertain as we are about the precise nature of the next successful technological threat, we are equally certain about the amount of loss that we are not only prepared to accept but have in fact built into the resource allocation decision.

What does influence businesses that associate with the .ca domain brand, and what CIRA has complete control over, is its relationship with its registrars: the 2009 Annual Report recognizes that CIRA’s sole source of operating revenue is domain registrations through its 154 registrars, over which CIRA has no formal control and with which it has no clear strategic relationships.

This represents a significant exposure, because CIRA depends completely on registrars as its source of operating revenue yet has no formal control over them or apparent strategic alignment with them; the corollary is that there remains complete freedom to define all aspects of future strategic relationships. CIRA promulgates a single consistent and coherent .ca positioning and claims a moderate 60% preference for conducting business with .ca-branded domains; however, it is not clear whether this slight advantage is directly attributable to the reputation earned by large, stable, conservative organizations such as Canada Post, the many federal government departments operating under the .ca ccTLD, and the even more numerous provincial government departments operating under the provincial second-level domains.

Baldwin Bicycle Case: an MBA Case Assigned World-Wide

This case is the number one result for the case on Google, and it earned me a grade of 99.5.  It has been assigned all over the world for years, as far away as University of Victoria, Philippines, and all around the US.  Although the case content is based on a managerial accounting issue, a solution cognizant of the strategic issues must be the ultimate one implemented:

 

Fence Company Management Accounting

Example of Strategic integration into an accounting issue:

 
 

The 2008-2009 Annual Report announced an aggressive education programme designed to support registrars in their sales function and identified small-to-medium-sized enterprises as a target segment.

The same infrastructure that keeps the incremental cost of servicing each additional .ca domain (and the registration fee it brings) low also means that CIRA’s commitment to “five nines” DNS availability necessarily commits it to a significant incremental fixed cost for initiatives such as the migration to BIND10, regardless of whether CIRA DNS servers will remain authoritative for a given .ca domain.

Each time a user clicks a link in their browser in order to navigate to a web page (or attempts to initiate an FTP, SSH, telnet, IRC, MSN, or any other connection on the Internet), they initiate a series of close to 10 processes, all of which must complete successfully in order to fulfill the user’s request. Successful completion of each click, however, appears to the user as one action (click) and one result (new page displayed, if successful; status code, such as “404 Not Found” displayed otherwise).

Domain Name Server resolution of a .ca domain to a numeric IPv4 address is one of these steps. Although CIRA members, staff, and management, as well as technically inclined and informed users, surely know that resolution of a .ca domain for which CIRA DNS servers are authoritative is only one of these steps, the majority of Internet users would view a failure at any point in the sequence as the failure of a single process. Although “five nines” high availability, security, and trustworthiness of the .ca domain brand may be intuitively linked in our minds, current aggressive publicity about the number of DNS requests per minute and the sophisticated steps taken to localize and mitigate security exposures in processes over which CIRA does exert direct control implies that users can trust the entire sequence of events they initiate with each click. That Statistics Canada (2005) highlighted the three-quarters of Canadians concerned about Internet security underscores even more the need to manage expectations of secure transactions.
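A rough way to see why "five nines" for DNS alone says little about what the user experiences is to multiply the availabilities of every step in the chain. The sketch below assumes the steps fail independently and uses invented availability figures for the non-DNS steps; neither assumption comes from CIRA, and the function name is hypothetical.

```python
import math

def end_to_end_availability(step_availabilities):
    """Availability of a chain in which every step must succeed.

    Assumes the steps fail independently, so the individual
    availabilities simply multiply.
    """
    return math.prod(step_availabilities)

# Hypothetical figures: DNS at "five nines", nine other steps at 99.9% each.
dns = 0.99999
other_steps = [0.999] * 9

overall = end_to_end_availability([dns] + other_steps)
print(f"end-to-end availability: {overall:.4%}")
```

Under these assumptions the user sees a chain that is noticeably less reliable than its best link, and to the user any break in that chain looks like one and the same failure.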

Educating Canadians on the technical intricacies of why CIRA was not responsible for identity theft that occurred because the .ca domain’s hosting provider neglected to update a critical security component is neither a realistic solution nor does it explain why the much-touted DNSSEC, IPv6, BIND10, and other initiatives the Annual Report said would keep Canadians safe did nothing to prevent precisely what they feared. Cursory investigation of a convenience sample of registrars shows a disconnect between CIRA positioning of the .ca brand as a differentiated product with distinct advantages and inconsistent responses to what the real differences between .ca and other domains are. Perhaps the most undermining to CIRA’s branding–and even the importance of CIRA DNS services in the first place–are answers such as the following from

http://baremetal.com/domains/ca_faq.html#ca_diff

What is different about .ca domains?

Quite a few things, which most of which are due to CIRA. The biggest difference is that _everyone_ registering a .ca domain name has to visit the CIRA website to accept their (very long) registrant agreement. Many changes to domain registrations also require a visit to the CIRA website for confirmation. Basically anything which affects ownership of a domain has to be confirmed at the CIRA website. While it can be a nuisance, there are a number of positive benefits (for example registrar transfers are fast and simple when compared to the com/net/org world).

The most beneficial educational programme should logically focus on nurturing strategic relationships with registrars before introducing the complexity of targeting small businesses. The small business segment, though it would doubtless appreciate the trust and brand equity builtith different business practices, ethical guidelines, and priorities.

Further segmentation studies and investigation of a new small-business segment, with distinct positioning catering to its needs, are warranted, and CIRA should encourage strategic alignment and partnership with the registrars, highlighting the value of a more consistent vision for the .ca domain brand. Perhaps more importantly, though, this could serve to insulate the hard-earned trust in the current .ca brand, likely established by major institutional organizations, against the inevitable uncertainty and diversity of business practices that would necessarily result from specifically targeting small and medium businesses and increasing their proportion of the dot-ca domainspace. Another registrar from the same convenience sample further undermines CIRA’s branding efforts in a way aligned neither with the Baremetal position, above, nor with CIRA’s .ca brand efforts.

The Fidonet Mascot

Canreg.Com directly dismisses any accountability or perceived trust in the .ca brand as a “myth.”

In order to make my own positioning clear, aware that I might have taken some of you by surprise, I believe I can make a unique contribution to the evolution of the Internet in Canada. Other specialized candidates may have more training or experience than I in their respective specializations. But just as classical experimental research aims to isolate individual variables to study them without interference from confounding effects, whereas factorial research designs and multivariate analysis acknowledge that some effects are evident only as the result of interactions between variables, I believe I bring to the table a cohesive vision of the Internet, uniquely informed by my understanding of it as the complex, evolving union of the sociological, technological, and commercial aspects of society. I contrast this with a popular model of the Internet as the application of technology to facilitate societal interaction driven by commercial investment. I feel that model has the same pigeon-holing effect that old-school marketing segmentation based on demographics or psychographics had: it tries to assign individuals to contrived categories, unlike the current statistical methods that identify groups of customers who share buying behaviours and preferences but who may transcend any attempt to label them with a characteristic consistent across the segment.

RIM Inter@ctive 950 Two-way pager that I had before the Blackberry

In the same way, I saw the natural co-evolution of society with technology in the BBS world and immediately adopted wireless voice in 1991, wireless data in 1994, and RIM’s pre-BlackBerry two-way pager in 1998, not so much because I believed those technologies would become the dominant design but because their later widespread incorporation into society was contingent upon some user base supporting the vision, so that they would survive and develop into what society eventually embraced. This passion for and belief in the Internet as part of society and not parallel with it, I believe, can inform policy and infrastructure that anticipate future challenges and roles rather than react to them.

Added September 10, 2010: The Show of Support has passed.  According to the site, I received 12 out of the 20 that I require; however, since they are not able to make the official announcement until September 13, it is possible that there are votes not counted in the online tally.  I’ll keep you all posted, but for now, I thought I’d share the response of the President, Byron Holland, to my strategic assessment:

Research Design 102: Redesigning a Better CIRA Survey

Yvon, according to the Office of the Commissioner of Official Languages, neither CIRA nor federal programs require that there be an evident need: http://www.ocol-clo.gc.ca/html/faq2_f.php#q4

The following post was actually primarily a response to "Canadian Public Interest in Internet Policy and Decision Making" sent by CIRA in October, 2009. If it were a one-off survey conceived by someone at CIRA whose responsibility never before included surveys or questionnaire design, I could overlook the survey as a meaningless make-work project; however, the intent to find something out does seem genuine.

Something as fundamental as the survey's apparent intent to identify which issues concern CIRA most, and its apparent desire (judging from the consistent use of open-ended questions) to understand more about these important issues, seems worth doing properly: if not by hiring a marketing research consultant to design and execute the research, then by asking any researchers on the Board to improve the survey questions and internal validity, even if their specialization is not a social science at all.

As a Canadian actively online since the late 80s, I chose to participate in this year's CIRA board election because of my keen desire to make meaningful contributions that might not otherwise be voiced, informed by my holistic understanding of the social, technological, and commercial factors that in combination sometimes support outcomes not anticipated when considering them individually.

Although I am confident the board would comprise individuals with stronger competencies than mine in isolation, it was this understanding of the factors in combination that led me to launch a public online service providing rudimentary international file- and echo-dissemination services using pre-Internet technologies, in anticipation of what was as clearly an eventuality as wireless data was when I first adopted it in 1994. Social media stands poised to change the site-centric paradigm that predated even the Web in Gopher, and that even Archie extended but could not transcend. In much the same way, Mendeley is within reach of applying social media to change the rules of scientific research to ones that reinforce scholarliness over prestige. It presents functionality that actually facilitates researchers' workflows and, by design, removes the practical limitation of not knowing what every researcher in the world may have considered relevant to your own research, a limitation that has legitimized Impact Factor (how often a journal is cited) as an indicator of research quality.

I emphasize the role social media will play in the parallel evolution of the Internet and research theory and design because it was the qualitative survey instrument featured above that at once caught my attention and concerned me. I have experience in both applied (marketing research) and scientific (ethnography, phenomenology, typography, and others) qualitative research, including qualitative research designed to inform subsequent quantitative research (sequential exploratory mixed methods). I believe strongly that the research questions the survey attempts to investigate, along with the survey questions themselves, are representative of the poor understanding even post-doctoral researchers often have of the nascent discipline of qualitative research.

The biggest concern I have about the survey (and one I expect any academic institution's Institutional Review Board, which must approve any research that involves human subjects, would share) is that it unloads the researcher's lack of clear research direction onto the respondents, expecting them to compensate for an arbitrary "fishing expedition" research design that has no hope of probing any specific concerns (phenomenology) or giving any meaningful insight into the attitudes, concerns, behaviours, or perceptions CIRA members have (ethnography).

Although concept mapping software designed to facilitate coding and interpretation of open-ended qualitative lines of questioning continues to evolve along with qualitative research as a discipline, it cannot turn a poor research design into a good one. Qualitative research is not merely the incorporation of non-numerical data into otherwise quantitative research projects. Qualitative research is appropriate for answering entirely different research questions, with entirely different objectives, from those of quantitative research. Open-ended questioning, which lets respondents answer using whatever words they feel appropriate and with as much detail as they please, allows researchers both to adjust questions dynamically and to probe interesting responses.

Thus, qualitative research does not aim to determine whether a theory governs a behaviour or phenomenon, which is the domain of quantitative research of various designs that test specific hypotheses that a theory predicts using deductive reasoning. Instead, when it is intended to inform theory construction, it is generally the abstraction of a theory from observations via inductive reasoning (such as Grounded Theory)–the precise opposite of quantitative research.

Qualitative research that does not aim to abstract a theory from observations, such as ethnographic or phenomenological research, is not at all interested in whether a behaviour or phenomenon is representative of anything at all; it simply aims to explore the behaviour or phenomenon.

One of Walden University's strengths, however, is its unique presentation of research theory and design as a logical workflow to provide context to select not only *an* appropriate research design for a given research question, but *the* most appropriate design for *the* most appropriate research question to ask.

This necessarily means beginning every inquiry by considering the epistemological and ontological foundations of the research in the first place. As I point out in a LinkedIn discussion response to Rick Anderson's two points asserting the importance of Canadian presence to .ca domain eligibility (in reality, one point and one rationale supporting it), technological minds are well prepared to come up with all sorts of innovative mousetraps: spring-loaded ones, biodegradable ones, fashionable ones, low-cost ones, decorative ones, ultrasonic ones, chemical ones, and many more than I would conceive.

But sometimes going back to defining the real underlying question might change the research question from "What is the best mousetrap for our strategic positioning and target market?" entirely to "Is there an easily-repaired hole allowing mice into the house?"

Without indulging the scenario too far, the first research question can be an extremely involved one, beginning with marketing research to characterize a target market segment that is defined by shared buying preferences but may transcend traditional demographic or psychographic categories. These characteristics can be reduced to a smaller set of more meaningful attributes using principal components analysis and then fed into a conjoint analysis model that would build "the perfect mousetrap" from the ground up with the most desirable combination of attributes identified by the marketing research and validated with expensive focus groups before investing even more money in a prototype and market testing.
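For readers who have not met conjoint analysis, the following Python sketch shows the basic idea on a toy scale. Everything in it is hypothetical: the three attributes, their levels, and the ratings are invented, and the part-worths are estimated by simple main-effects averaging over a balanced full-factorial design rather than by the regression-based estimation a real study would more likely use. The principal components step is omitted entirely.

```python
from statistics import mean

# Hypothetical mousetrap attributes and levels (illustrative only).
attributes = {
    "mechanism": ["spring", "ultrasonic"],
    "price": ["low", "high"],
    "material": ["plastic", "biodegradable"],
}

# Invented average ratings (1-10) for every profile in the full factorial.
ratings = {
    ("spring", "low", "plastic"): 6,
    ("spring", "low", "biodegradable"): 8,
    ("spring", "high", "plastic"): 3,
    ("spring", "high", "biodegradable"): 5,
    ("ultrasonic", "low", "plastic"): 7,
    ("ultrasonic", "low", "biodegradable"): 9,
    ("ultrasonic", "high", "plastic"): 4,
    ("ultrasonic", "high", "biodegradable"): 6,
}

names = list(attributes)
grand_mean = mean(ratings.values())

# Part-worth of a level = mean rating of profiles with that level,
# expressed as a deviation from the grand mean.
part_worths = {}
for position, name in enumerate(names):
    for level in attributes[name]:
        level_ratings = [r for profile, r in ratings.items()
                         if profile[position] == level]
        part_worths[(name, level)] = mean(level_ratings) - grand_mean

# "The perfect mousetrap": the best-scoring level of each attribute.
best_profile = {
    name: max(attributes[name], key=lambda level: part_worths[(name, level)])
    for name in names
}
print(best_profile)
```

With these invented ratings, price carries the largest part-worths, which is the kind of trade-off information conjoint analysis is meant to expose before any money is spent on a prototype.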

As presented, the CIRA survey attempts to determine what issues CIRA members consider important (a research question appropriate for a quantitative survey, such as rating or ranking on a set Likert scale), but it allows respondents to disqualify their responses with unclear or non-applicable answers and maximizes the validating, coding, and interpreting workload, all to yield only limited insight: future qualitative research topics to probe, at best.

I know that not everyone is interested in knowing any more than that [in their entire lives] about research design. But I have committed myself to the continued effort to help professionals and practitioners understand that many academic theories were derived from real-life stock prices and that sometimes "too theoretical" is an excuse to avoid thinking; to show academics the real-life application and context for the theories they work hard to generalize; and to persuade the general public, who feel that neither academics nor executives consider them important enough to take a moment and explain anything, not to think that corporations are only set on exploiting them (without doing it by giving them what they want) or that academics purposely make theory incomprehensible to keep it from the masses. In fact, I believe the reason theory construction and scientific inquiry work the way they do would become clear to anyone who invests the time and effort to develop more structured, disciplined reasoning.