1. What are the ethical issues identified in “How big data will haunt you forever: your high school transcript”? How might they best be addressed?

2. Find and explain one other case of ethical issues arising from big data. What’s the case and what’s the ethical issue? (Pls give sources and links.)

500 words, due by noon Wednesday 19th here or on the forum.


  1. Marc Stahl says:

    Be Prepared To Waive Your Rights if You Want A Job

    The article on big data describes how the ever-increasing capacity to store and retain electronic information poses lifelong challenges for students. In the past, information regarding scholastic performance, academic achievements (and failures), and even participation in school events was kept in paper form. The administration and retention costs of this paper-based data created a natural lifecycle for the data: it began and ended with a student’s academic career (and often more specifically with the academic career at a particular institution). Today that same information transcends institutions and even time, forming a permanent record which can be carried forward indefinitely. If this data remained private and under the control of the individual student this would not be a concern; however, the technical ability to correlate this data with predicted performance and success in school, careers, vocations, relationships, etc. poses a new threat. Students may find their futures limited by a narrow view of their past performance, without consideration for extenuating circumstances, context, individual growth potential, or even the bias of the instructors and institutions they encountered.

    In searching for examples of ethical issues facing the use of big data I encountered an even larger question: just because the outcome of using big data may be undesirable or unpleasant for an individual, is it unethical? In the article above students may not like predictions made by big data systems, but if they are right only 50% of the time is their use as the foundation for decision making really unethical (even though obviously unpleasant)?

    One example I found was a Wall Street Journal article that discovered one group of users of the Orbitz travel booking site paid on average 30% more for hotel rooms than another group of users of that same site. The difference was due to big data analysis showing Mac users were less price-sensitive than Windows users, so Mac users were steered towards pricier hotels. On the surface the article fosters outrage at the suggestion that some users are squeezed for more money than other users just because of the computer they use, which seems clearly unethical. However, a deeper look suggests that Orbitz is simply giving users what they want; and if Mac users often want pricier hotels, why not give these to them? From that perspective what Orbitz did was not just ethical, but actually met the needs of the consumer.

    The ethical questions surrounding the use of big data come down to the balance between the rights of individuals (to protect their privacy and freedoms) and the rights of corporations (to maximize shareholder value). One can argue that corporations don’t owe anyone a job, and that individuals must sacrifice their right to privacy (including academic history, career history, criminal history, etc.) if they want the privilege of a job. And this is most likely the direction society is heading.

  2. Megan Craig says:

    The massive collection, storage, and processing of personal data that we have termed “big data” has opened up many possibilities in the analysis of activities and groups of people. In the excerpt by Mayer-Schonberger and Cukier (2014), they describe the use of big data in an educational context to quantify learning and teaching and improve student and instructor performance. They address ethical issues of privacy and individual identity. Whereas typical data privacy issues concern protecting access to personal information, with big data the issue is the permanence of the data. Mayer-Schonberger and Cukier suggest that this denies the possibility that humans can evolve and change with age, and that others may evaluate the individual unintentionally based on this information. Additionally, because it will be a comprehensive, accurate program, admissions officers and job recruiters will place greater weight on the information from it, which again strongly links the student to their past. Identity issues arise from the tracking and predictions that big data will support. The programs are designed to individualize learning for the most promising results, providing exact predictions of the work, materials, and teaching needed. The purpose is to allow for the highest probability of success, so what if it subtly constrains the individual to the easiest pathway, rather than the pathway of their choosing? So although education will be tailored to the individual and expected to be more successful, it may reduce learning freedom and opportunities for individuals to choose their track.

    To address these issues, Mayer-Schonberger and Cukier suggest that privacy laws requiring informed consent for the use of personal data would not be a viable option: big data often becomes useful for purposes that emerge long after the initial collection, so informed consent is impossible. Potentially, policies could make data processors more accountable for the misuse of big data by adding legal liabilities. They suggest transparency and regulatory oversight, with tough enforcement, to balance the educational benefits against the privacy and identity issues.

    Another case of big data usage that has led to ethical issues is the detection of disease outbreaks using Internet-based surveillance systems. These programs collect information regarding epidemics based on news media reports, search engine keywords, and location information. They can be used to predict diseases with similar precision to traditional surveillance from physicians and health departments (White, 2015). This approach is highly useful for detecting epidemics quickly and locating them geographically, but it raises ethical issues related to the privacy of big data and public health information. Traditional privacy questions of what data is acceptable and who should have access to it, as well as whether the individual is informed, arise with this Internet detection. More importantly, the public health issue is whether the methods will work. Flaws in the system could have widespread detrimental consequences: under-prediction could lead to lack of preparedness, whereas over-prediction could lead to panic, misallocation of resources, and stigmatization of groups in the population (White, 2015). So although it may be highly useful in tracking the health of the population, its practicality may be limited by information overload, false reports, and sensitivity to media interest (Brownstein, Freifeld, & Madoff, 2009).

    Brownstein, J. S., Freifeld, C. C., & Madoff, L. C. (2009). Digital disease detection – harnessing the web for public health surveillance. The New England Journal of Medicine, 360(21), 2153-2157.

    White, M. (2015, February 24). The ethical risks of detecting disease outbreaks with big data. Pacific Standard.

  3. Philip Thingbø Mlonyeni says:


    There are specifically two ethical issues raised in the article in the context of big data stored about high school students. The first concerns big data and student privacy; the second concerns big data and how it is to be used in decision-making in schools. The privacy concerns raised by big data are due to the fact that it stores all the data it can about students and can give us information about a student in a matter of seconds, when in the past this would have taken days or weeks. The student will have a permanent record that employers might want to see, and that might in some cases prove to be a liability for the student if she has some blemishes in her past. In the old days these might’ve gone unnoticed, but now big data will “remember” them forever.
    The ethical issue concerning big data and decision-making is in my opinion more severe than the privacy issue, mainly because it is more invasive in nature. Statistics have been used for a long time in decision-making, but their use has been limited by limited data. One danger with big data is the illusion of knowledge, the illusion of knowing what’s best for others, when there are severe epistemic problems intrinsic to big data, especially in such cases as academic decision-making. As pointed out in the article, “Customized education may actually lock in these streams more ruthlessly, making it harder for one to break out of a particular track if they wanted to or could”. In other words, customized education might lead to a lack of individual autonomy, where one’s range of academic choices is limited by what the data says are the statistically good choices.
    I think an important actor in such debates will be the students themselves and the bodies that represent them, like student councils. Decisions about what data should be stored should be preceded by a dialogue between those whose data is to be stored and those who are going to store it. This dialogue should be ongoing, and it is important for those whose data is stored to have a representative who can oversee the storage of data and report on it. Other actors will be needed as well, but in the specific case of storing student information I think this will be a key relation.


    I think the not-so-recent case of Facebook doing an experiment on its users is a good example of a key ethical issue resulting from big data. Facebook justified it by claiming it to be “in the pursuit of science”, but clearly there are some ethical issues here. One of them is the lack of consent by Facebook’s users, leading to questions of whether it was an experiment or something more akin to social manipulation. This is related to the problem in the above case of academic decision-making, but shows the problems on a grander scale, and the huge implications of messing with big data.

