2 Security, Privacy, Ethics, and Equity

This week we learn what is data security, privacy, ethics, and equity to us and society within the data life cycle.

2.1 How well do you know about basic corporate and governmental internet practices and policies

2.1.1 Test your knowledge

Class Activity 1

Complete this 17 true-or-false question quiz created by researchers from the Annenberg School for Communication at the University of Pennsylvania (Turow et al. 2023).

You will have 6 minutes to complete it (it should take you roughly 3 to 4 minutes).

ANSWERS BEYOND THIS POINT!

Answers to each question are beyond this point along with additional information. Please do not proceed until you have completed the quiz!

2.1.2 Answers to the quiz and discussion

Class Activity 2

We will review the answers to the quiz along with additional information that helps answer those questions.

Did the answer surprise you? Why or why not?
What assumptions did you have prior to knowing the answer?
What answer would you prefer? What steps would be necessary for the answer to change? For example, if the answer is “False,” would you rather the answer be “True”? If so, what would it take to make the statement true?

When I go to a web site, it can collect information about my online behaviors even if I don’t register using my name or email address.

TRUE

From Turow et al. (2023),

Companies have an ability to see what we do on our websites and apps (through first-party cookies and other such trackers); to follow us across the media content we visit (via third-party cookies and emerging versions); to view our activities as we move from one media technology to another— for example, from the web to our smartphone to our tablet to our “connected” TV to the in-store trackers we pass in the aisles, to outdoor message boards we stop to view. Whether with first- party data, third-party data, or more, the goal is to give us tags or personas and have computers decide whether and how we ought to be the companies’ targets.

A Smart TV can help advertisers send an ad to a viewer’s smartphone based on the show they are watching.

TRUE

See answer to question 1.

A company can tell that I have opened its email even if I don’t click on any links.

TRUE

Email tracking is a method that sends a signal to the sender’s server that the email has been opened. Many services use this method (the Urban Institute does this for their newsletters!). However, some email providers, such as Outlook and Gmail, have features that can block this type of tracking. You should check your email settings.

A website cannot track my activity across devices unless I log into the same account on those devices.

FALSE

See the answers to the previous questions. Cookies, IP address tracking, email tracking, and more can be linked together to track an individual across devices (although the accuracy might be reduced, such as being coarsened to the household level).

When a website has a privacy policy, it means the site will not share my information with other websites or companies without my permission.

FALSE

A privacy policy does not mean a site won’t share a person’s information with other sites without the person’s permission.

Chrome’s New Privacy Policy

Facebook’s user privacy settings allow me to limit the information about me that Facebook shares with advertisers.

TRUE

Check all the apps you use to adjust your privacy settings!

Figure 2.2: A screenshot of updating one’s ad preferences for their Facebook account.

All fifty states have laws requiring companies to notify individuals of security breaches involving personally identifiable information.

TRUE

There was a recent Federal Register Notice (FRN)¹ on “Data Breach Reporting Requirements.”

Figure 2.3: A screenshot of the “Data Breach Reporting Requirements” Federal Register Notice.

It is illegal for internet marketers to record my computer’s IP address.

FALSE

There is no federal law in the United States that protects consumer data privacy.

It is legal for an online store to charge people different prices depending on where they are located.

TRUE

Think about your local area and whether grocery stores change the prices of their goods based on their proximity to certain neighborhoods.

The doorbell company Ring has a policy of not sharing recordings with law enforcement without the homeowner’s permission.

TRUE

Check out this Associate Press article about the new policy!

Figure 2.4: Screenshot from the Associate Press article.

By law, a travel site such as Expedia or Orbitz that compares prices on different airlines must include the lowest airline prices.

FALSE

There are no United States laws mandating that such travel sites offer the lowest airline prices.

In the United States, the federal government regulates the types of digital information companies collect about individuals.

FALSE

Again…THERE IS NO FEDERAL CONSUMER DATA PRIVACY LAW!

Some large American cities have banned the use of facial recognition technology by law enforcement.

TRUE

San Francisco became the first United States city to ban facial recognition.

BBC article

NYTimes article

The US Federal government requires that companies ask internet users to opt-in to being tracked.

FALSE

For the third time, THERE ARE NO FEDERAL DATA PRIVACY LAWS!

In the United States, what you encounter on sites that prompt you to choose are usually ‘opt-out’ settings. If there’s an ‘opt-in’ alternative, it’s often because website designers have streamlined the site, not wanting to differentiate visitors from the United States or European Union countries, which have ‘opt-out’ data privacy laws

Section 230 of the Communication Decency Act ensures that digital platforms like Facebook, Twitter, and YouTube can be held responsible for illegal content posted on their platforms.

FALSE

From Section 230 of the Communication Decency Act:

No provider or user of an interactive computer service shall be treated as the publisher or speaker of any information provided by another information content provider.

The Health Insurance Portability and Accountability Act (HIPAA) prevents apps that provide information about health from selling data collected about app users to marketers.

FALSE

From Turow et al. (2023),

The Federal Health Insurance and Portability Act (HIPAA) does not stop apps that provide information about health – such as exercise and fertility apps – from selling data collected about the app users to marketers. HIPAA came into law in 1996 to “improve portability and continuity of health insurance coverage in the group and individual markets, to combat waste, fraud, and abuse in health insurance and health care delivery, to promote the use of medical savings accounts, to improve access to long-term care services and coverage, to simplify the administration of health insurance, and for other purposes.” Therefore, HIPAA does not prevent apps that provide information about health from selling data collected about app users to marketers.

Some social media platforms activate users’ smartphone speakers to listen to conversations and identify their interests in order to sell them ads.

FALSE

There has never been a documented case of a social media platform activating users’ smartphone speakers to eavesdrop on conversations and identify their interests for targeted advertising. Your browsing history already furnishes enough information for that!

2.2 What is data security, privacy, ethics, and equity?

For this week, you were assigned to read the first two chapters of the textbook, watched “Coded Bias,” researched a real-world example of data being used unethically and/or violated people’s privacy.

Class Activity 3

Drawing from the readings, documentaries, and your assignment, let’s delve into the following:

What data did AI, algorithms, or the example you found use to determine what you see?
Does the scenario you mentioned apply to everyone or only to a specific subgroup of individuals? Why or why not?
Could the scenario you brought up be used for a public good? If so, what changes would be necessary to create a public good?
Does those changes involve security, privacy, ethics, and equity?

Important

Did you notice that the AI narrator in “Coded Bias” speaks with a female voice.

Consider other AI interfaces we interact with daily. Do they also predominantly feature a female voice? What biases might this perpetuate within society?

Scarlett Johansson says she is ‘shocked, angered’ over new ChatGPT voice - NPR Article

Figure 2.5: Screenshot of the NPR article.

2.2.1 Defining key stakeholders in the data ecosystem

For this course, this is how we define the various stakeholders in the data ecosystem.

Figure 2.6: Key stakeholders in the data ecosystem

2.2.2 Security, Privacy, and Confidentiality

Data Security

Data Security is the “…science of methods of protecting computer data and communication systems that apply various types of controls such as cryptography, access control, information flow paths and inference control, including backup and recover” (Denning 1982).

Data security often refers to the hardware, software, storage devices, and user devices; access and administrative controls; and organizations’ policies and procedures. For example, many organizations have switched to two factor authentication and VPN to access their systems.

Data Privacy

Data Privacy is the ability “to determine what information about ourselves we will share with others” (Fellegi 1972). Data privacy is a broad topic, which includes data security, encryption, access to data, etc. We will not be covering privacy breaches from unauthorized access to a database (e.g., hackers).

Confidentiality

Confidentiality is “the agreement, explicit or implicit, between data subject and data collector regarding the extent to which access by others to personal information is allowed” (Fienberg and Jin 2018).

There are at least three major threats to data security and privacy.
1. Hackers: adversaries who steal confidential information through unauthorized access.
2. Snoopers: adversaries who reconstruct confidential information from data releases.
3. Hoarders: stewards who collect data but don’t release the data even if respondents want the information releasesd.
There are differing notions of what should and shouldn’t be private, which may include being able to opt out of or opt into disclosure protections.
Data privacy is a broad topic, which includes data security, encryption, access to data, etc. We will not be covering privacy breaches from unauthorized access to a database (e.g., hackers).

2.2.3 Ethics and Equity

Ethics

Data ethics is the “…systemizing, defending, and recommending concepts of right and wrong conduct in relation to data, in particular personal data” (Kitchin 2014).

Most research institutions and government entities have an Institutional Review Board (IRB), an institution that applies research ethics by reviewing the methods proposed for research involving human subjects, to ensure that the projects are ethical. IRBs do not evaluate the quality of the research. They evaluate the the ethical protection of human subjects.

Equity

Data Equity is …

representation
access
process
outcomes
and more

Data equity is also a complex concept, which should be thought of throughout the data life cycle. How do we ensure people are properly represented in data or various organizations in the United States have access to that data to better communities?

Class Activity 4

Prior to starting this course, how would you have defined data security, privacy, ethics, and equity?
How has your definition of these different terms changed since the assignments and these discussions? Why or why not?

2.3 Data life cycle

We’ve just started exploring the world of security, privacy, ethics, and equity. Throughout this course, we’ll integrate these definitions, concepts, and ideas into our discussions, applying them to the data life cycle, which we define as:

data collection or acquisition (week 2)
data storage (week 3)
data sharing and transfer (week 3)
data analysis (week 4)
data dissemination (week 5)
data destruction or termination (week 6)

No set taxonomony in the field!

All definitions used (including the data life cycle) are my opinionated definitions. Since many different fields work in data security, privacy, ethics, and equity, there is no standard taxonomy, which causes a lot of confusion. I set a standard definition in all my work, including this course, to ensure we are using the same common language. However, note that when reading other materials or literature, you might encounter conflicting terminology.

FRN is for proposed rule-makings and updates, proposed settlements, public meetings and workshops, and other important agency activities.↩︎