Transparency and Explainability Risks
Emerging technology has the potential to solve big societal problems in new ways, but the solutions it provides are often more opaque and more difficult to understand than the old ones. This section explores the risks that emerging technologies pose to transparency and explainability, and how these risks can be identified and mitigated.
Transparency is a way of operating that makes it easy for others to see what actions are performed. As a metaphor, transparency indicates two things: seeing something clearly and understanding something. But one does not always go with the other: we might clearly see something, but not understand it. Transparency is more than just providing information.
Transparency enables people and institutions to justify their actions. Justifications are important because they provide a basis for legitimacy: they explain the reasons for acting in a certain way. Moreover, transparency enables people to change things. Knowing what went wrong, and why, enables us to intervene: we can hold people accountable, request changes, and ask for redress. Transparency is especially important when decisions concern the public (e.g., decisions on taxation), when they might cause physical or mental harm (e.g., decisions on healthcare), or when they otherwise affect people's rights.
A system is explainable to someone if the person is in a position to understand which inputs led the system to produce certain outputs of interest. This idea is also sometimes called interpretability. The two terms are sometimes distinguished, but we can use them interchangeably for our purposes. In fact, the ideas of explainability or explicability, interpretability, and auditability are closely related in the context of transparency.
If a system is explainable, it is possible to explain its decisions in a way that satisfies people wishing to understand them. For instance, a citizen might ask a municipality worker why a building permit was declined. In that case, the answer will involve something like “because the proposed building is taller than the threshold specified by the building regulation, and so would have violated that regulation”.
Providing such an explanation enables understanding: the citizen now knows why the permit was declined and what they would need to change for it to be approved. Note that the explainability of a system is always relative to a particular person with particular needs. A lawyer may not be satisfied with the explanation given to the citizen because it lacks concrete references to the law in question, whereas couching the explanation in legalese would have made it incomprehensible to the citizen.
Transparency has become an increasingly salient issue, largely because emerging technologies continue to render certain aspects of society more complex. They add to this complexity in three ways:
Black box problem: So-called “Good Old Fashioned AI” (GOFAI) uses decision trees to get from a certain set of inputs to an output. In principle, this approach can be transparent: a human can trace an outcome back to the inputs, following the logical steps taken. In contrast, modern machine learning models use hidden layers of information to make decisions, which makes it hard, or even impossible, to trace an outcome back to its inputs. This generates the black box problem: the decision-making process between inputs and outputs is not visible or understandable to human beings. (A brief code sketch after this list illustrates the contrast.)
Expert rule: Data-driven systems become increasingly complex, and their design depends on insights from very specialized scientific disciplines. Because of this, scrutinizing the decisions made by these systems requires expert knowledge. Even if a lay person could see everything that goes on inside a system, they would usually not be able to understand it.
Commercial secrets: Many data-driven systems are developed by private companies. These companies often have an interest in keeping information secret or confidential. For instance, they might choose not to disclose information about proprietary algorithms. They might also integrate third-party products into their systems, meaning that these third-party products remain inaccessible for inspection. Finally, companies might have an interest in taking decisions about user experience without informing users, like in the case of shadow banning—partially blocking a user’s content without their knowledge.
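To make the contrast behind the black box problem concrete, here is a minimal, illustrative sketch using scikit-learn; the dataset and model choices are assumptions made for brevity, not part of the original text. A shallow decision tree can be printed as explicit if/else rules that a human can follow, whereas a trained neural network exposes only weight matrices.

```python
# Illustrative sketch: a traceable decision tree vs. an opaque neural network.
from sklearn.datasets import load_iris
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# A shallow decision tree: every prediction can be traced through explicit rules.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(iris.data, iris.target)
print(export_text(tree, feature_names=iris.feature_names))

# A small neural network: its learned behavior lives in weight matrices that a
# human cannot read as a chain of reasons.
mlp = MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0)
mlp.fit(iris.data, iris.target)
print([w.shape for w in mlp.coefs_])   # e.g. [(4, 32), (32, 32), (32, 3)]
```

The printed tree reads much like the building-permit explanation above; the list of weight shapes is all the neural network offers without additional tooling.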
A system is a black box for a certain person if the person does not understand, and is not in a position to understand, the relationship between the inputs and outputs of the system. The way such systems process information is therefore not transparent to that person, and that person cannot explain the outputs based on the inputs.
There are four major reasons why systems become a black box to a person. The first two are technical in nature; the other two are organizational and legal. Each reason is explained in more detail below.
Artificial neural networks
Self-learning models
The integration of third-party models
Intellectual property rights
Artificial neural networks are machine learning architectures that are modeled on the way animal brains work. They consist of artificial neurons and connections between these neurons, which resemble synapses in biological brains. The strength of the connections between neurons adjusts as the algorithm learns.
Learning proceeds by considering example inputs and outputs, iteratively tweaking the weights of the connections between neurons until the network creates the desired outputs for each of the corresponding inputs.
The result is a complex web of connections between artificial neurons, characterized by the network’s architecture and the weights of the connections between the neurons. While artificial neural networks are very effective at solving a large range of tasks, from image classification and other computer vision problems to natural language processing, their architectures are difficult to interpret.
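As an illustration of this learning process, here is a minimal NumPy sketch; the data, network size, and learning rate are arbitrary assumptions. A tiny two-layer network repeatedly nudges its connection weights to shrink the gap between its outputs and the example outputs.

```python
# Illustrative sketch: learning as iterative adjustment of connection weights.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                      # example inputs
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)    # example outputs (labels)

W1 = rng.normal(scale=0.5, size=(3, 8))            # input -> hidden connections
W2 = rng.normal(scale=0.5, size=(8, 1))            # hidden -> output connections

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for step in range(2000):
    h = np.tanh(X @ W1)                            # hidden-layer activations
    p = sigmoid(h @ W2).ravel()                    # predicted probabilities
    delta = (p - y)[:, None] / len(y)              # output-layer error signal
    # Backpropagation: push the error back through the network and nudge
    # each connection weight in the direction that reduces the error.
    grad_W2 = h.T @ delta
    grad_W1 = X.T @ ((delta @ W2.T) * (1 - h ** 2))
    W2 -= lr * grad_W2
    W1 -= lr * grad_W1

print("training accuracy:", ((p > 0.5) == y).mean())
```

Even in this toy example, everything the network has learned is stored in the weight matrices W1 and W2; inspecting them directly reveals little about why a particular input produced a particular output.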
To evaluate outputs, ground truth datasets are used. These datasets contain the ideal expected results. Despite their appeal to objectivity, ground truth datasets are often the result of subjective processes: example data points are frequently labeled by hand, and the way in which the ideal expected results are arrived at often lacks transparency.
Self-learning models integrate new data into the model in an automatic way. This can be useful to preserve the accuracy of a model over time.
Consider a model predicting foot traffic. Predictions based on data collected before the COVID-19 pandemic will be inadequate to predict foot traffic during a lockdown situation. Self-learning models can preserve performance by regularly or even continuously integrating new data into the model, in this case including how foot traffic declines because of lockdown measures.
Yet self-learning models can also have the opposite effect. If the training process is automated, performance may decline. For instance, new data may arrive in lower volumes and therefore lead to overfitting, or new features may be added to the training data that introduce spurious correlations.
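The sketch below illustrates this risk under assumed conditions; the data generator, model, and batch sizes are all hypothetical. An automatic update on a small, shifted batch of data can quietly erode the model's performance on the cases it previously handled well.

```python
# Illustrative sketch: an automatic ("self-learning") update on a small,
# shifted batch of data, followed by a check on the original regime.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

def make_batch(n, shift=0.0):
    """Hypothetical foot-traffic-style data; `shift` mimics a change such as a lockdown."""
    X = rng.normal(loc=shift, size=(n, 5))
    y = (X[:, 0] + X[:, 1] > shift).astype(int)
    return X, y

X_hist, y_hist = make_batch(5000)                   # historical (pre-change) data
model = SGDClassifier(random_state=0).fit(X_hist, y_hist)

X_eval, y_eval = make_batch(1000)                   # held-out data from the original regime
print("before update:", model.score(X_eval, y_eval))

X_new, y_new = make_batch(50, shift=2.0)            # small batch collected after the change
model.partial_fit(X_new, y_new)                     # automatic update, no human review
print("after update: ", model.score(X_eval, y_eval))
```

If no one monitors the "after update" score, any degradation can go unnoticed, which is precisely the transparency problem that self-learning models introduce.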
Using third-party models or integrating them with your own models creates transparency and explainability risks. This is because a system may be a black box to you, even when it is transparent and explainable for the third party that built it. If the third party does not provide you with sufficient detail on how their algorithm works, you might introduce a black box system into your organization.
In addition, the integration process itself might jeopardize transparency and explainability. You may have to modify the third-party solution to make it fit your needs. If these modifications are not well documented and communicated, you may undermine transparency and explainability.
Intellectual property rights include copyrights, trade secrets, and trademarks. Many organizations protect their intellectual property to defend against replication of their product or service. Yet for precisely this reason, closed-source intellectual property introduces transparency and explainability risks.
If you use third-party closed-source technology, you might not be able to gain sufficient insight into the way the technology transforms data. This creates the risk that you cannot verify whether the vendor has done their due diligence in creating the technology. That, in turn, introduces ethical, legal, and compliance risks, and can block independent auditors from certifying your product. In addition, closed-source intellectual property can turn a system that is transparent to you into a black box for users.
There are strategies for making AI more explainable. It is important to note, however, that making AI explainable creates risks of its own. In particular, research has shown that explainability tools can lead both users and data scientists to over-trust and misread algorithms. This can produce incorrect assumptions about the model and, curiously, about its explanation. As a result, explainability tools can instill false confidence about deploying the models.
Strikingly, explainability tools have been shown to induce a false sense of confidence even when they were manipulated to show explanations that made no sense. This is not to diminish the importance of explainable AI, but rather to extend the principles that are introduced in this course to explainability tools themselves. In particular, the explanations themselves need to be sufficiently transparent to contribute to, rather than detract from, the explainability of a black box system.
As with other ethical risks, you will often have to consider other ethical values as you determine how best to mitigate transparency and explainability risks. Transparency is often conducive to accountability and trust, and therefore an essential feature of data-driven systems. Yet, transparency is not always desirable and might conflict with other ethical values. In particular, it can stand in the way of confidentiality, efficiency, and privacy.
Most basically, transparency is about showing and explaining things, about making them visible. However, human relations often require a degree of confidentiality: businesses cannot operate without trade secrets, governments need to secure crucial defense information, and relationships between people often involve confidentiality concerning personal information.
Consider the example of agencies that supervise financial actors, like players in the financial technology sector. On one hand, these agencies might allow for full transparency by disclosing all the information collected and managed by companies. This would create a level playing field and would make sure that no player has an unfair advantage or engages in dubious activities. On the other hand, full transparency might conflict with confidentiality agreements between the company and its clients (for instance, about sensitive transactions), and increase the risk of disclosure of personal information.
Once confidentiality is broken for the sake of transparency, this can damage trust in the institutions involved.
Making our decisions and processes transparent takes a lot of work and can cost significant resources. It can therefore conflict with the goal of operating quickly and efficiently. Transparency can also impose cognitive burdens on an audience. When everything is transparent, it is very hard to focus on the things that really matter. Sometimes, making things transparent can make it even harder for people to understand what is going on. In this way, transparency can conflict with our ability to process information efficiently.
Consider, for instance, the use of AI voice assistants to induce cooperation between humans. Research has shown that voice assistants can be more effective in inducing cooperation than human intermediaries, which would be valuable in cases such as negotiating with malicious actors. However, transparent communication with the voice assistants would require that they disclose their nature, as being artificial rather than human. In this case, increased transparency would defeat the purpose of the voice assistant, thereby reducing its efficiency.
An increase in transparency can promote accountability, but it can thereby also violate privacy. Full transparency would mean that who does what is visible to everyone, which also makes it possible to get hold of potentially sensitive personal information.
Consider the case of cryptocurrencies like Bitcoin. Bitcoin’s strength lies in the transparency of its shared ledger. Because all nodes have access to the full transaction history of Bitcoin, each user can trust that every transaction is authentic and accounted for. And yet, this transparency comes with a risk. Even though Bitcoin is pseudonymous (transaction addresses are not directly linked to real names), there are ways in which transactions can be linked to individuals. It, therefore, offers full transparency (in terms of transactions) at the cost of privacy.
Explainable AI is AI that is not a black box. A system is explainable to a person if the relationship between the inputs and outputs of the system is, or can be, understood by that person. There is currently a lot of research under way to make ML algorithms more explainable. These tools can help answer questions such as the following (a brief sketch of one such tool appears after the list):
Why did an algorithm make one decision, rather than another?
How can we correct for failures of the algorithm?
To what extent can we trust the algorithm?
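As one illustration, the sketch below uses permutation importance from scikit-learn; the dataset and model are assumptions chosen for brevity, and this is only one of many explainability techniques. The idea is to shuffle each input feature in turn and measure how much the model's performance drops, which indicates how strongly the model relies on that feature.

```python
# Illustrative sketch: permutation importance as a simple explainability tool.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, random_state=0
)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature and measure the drop in test accuracy: a large drop
# means the model relies heavily on that feature.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for idx in result.importances_mean.argsort()[::-1][:5]:
    print(f"{data.feature_names[idx]}: {result.importances_mean[idx]:.3f}")
```

Output like this helps answer the questions above at a global level, but it does not by itself explain any single decision, which is where the local techniques discussed later come in.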
There is also an important communication aspect to explainable AI. Whether or not a system is a black box can, again, be different for different people. An algorithm that is explainable to the engineers in your organization who made it may not be explainable to users outside your organization, or even to other people within your organization. Not every algorithm needs to be transparent to every stakeholder. Rather, as we have seen, we need to carefully distinguish what the ultimate goals of transparency and explainability are.
Hence building explainable technology goes far beyond tasking software engineers with applying the right set of tools. Building explainable AI starts by determining the transparency and explainability needs of stakeholders.
There are two main steps involved in determining the transparency and explainability needs of stakeholders:
Understand which decisions in your organization are made by or with the support of algorithms. For external stakeholders such as users, it is often not even clear which decisions are made by algorithms. But even within the organization, the role of algorithms can easily go unnoticed. Pay particular attention to algorithms that raise flags in terms of one of the four sources of transparency and explainability risk: models using opaque architectures such as artificial neural networks; self-learning models; models developed by third parties; and closed-source models.
Identify whether or not certain stakeholders have particular transparency or explainability needs with respect to an algorithm. As we have seen, explainability and transparency are not necessarily valuable in themselves, but they:
Help ensure that decisions are sound.
Preempt biases in decision-making.
Uphold trust in decision-making.
Enable self-advocacy.
The result of these steps is a list of decision-making processes undertaken or supported by algorithms that are mapped to the transparency and explainability needs of stakeholders. Based on this analysis, you can adopt one of the following mitigation strategies:
Explanation of how the system works. Recall that explanations may need to look different depending on the stakeholder in question and their explanation needs. Users will often benefit from a narrative explanation outlining the input and output data, including how it is collected, together with a high-level explanation of how the data is manipulated by the algorithm. Visuals like flowcharts are helpful for explaining the process to anyone, not just a lay person. Yet what a meaningful explanation looks like will mostly vary with the need it seeks to address. For instance, to give users an understanding of how their behavior influences algorithmic decision-making, they mostly need to know about the input data and how it is collected. By contrast, to enable experts to check whether an algorithm treats different groups fairly, they may require a detailed technical explanation of the model, or even access to the algorithm itself.
Explanation of which factors determined the decision. Another way of approaching the explanation of a black-box system is to focus on which of its inputs drive particular outputs. This is sometimes called the interpretability of a model. There are two kinds of interpretation: global and local. Global interpretations explain to what extent each input factor contributes to the outputs of a machine learning model overall. By contrast, local interpretations explain to what extent each input factor determines a particular prediction.
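The sketch below shows both kinds of interpretation for a linear model, where they can be read directly off the coefficients; scikit-learn and the dataset are assumptions for illustration, and more complex models require dedicated tools.

```python
# Illustrative sketch: global vs. local interpretation of a linear model.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=5000))
pipe.fit(data.data, data.target)

scaler = pipe.named_steps["standardscaler"]
coefs = pipe.named_steps["logisticregression"].coef_[0]

# Global interpretation: which features matter most across the whole dataset.
top_global = np.argsort(np.abs(coefs))[::-1][:3]
print("globally most influential:", [data.feature_names[i] for i in top_global])

# Local interpretation: which features drove the prediction for one specific case.
x = scaler.transform(data.data[:1])[0]
contributions = coefs * x                       # per-feature contribution to this logit
top_local = np.argsort(np.abs(contributions))[::-1][:3]
print("most influential for this case:", [data.feature_names[i] for i in top_local])
```

For non-linear models the same distinction holds, but the contributions have to be estimated rather than read off the coefficients.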
Keep humans in the loop. Keeping a human in the loop means that a human operator is involved in the decision-making process. Involvement can take different forms. In some approaches, a human is the ultimate decision maker, based on input from the machine learning system. In others, humans review a subset of decisions, perhaps those that are flagged as potentially problematic. In both approaches, keeping a human in the loop improves transparency and explainability. Because the algorithm is designed from the start in such a way that a human can properly review algorithmic decisions, it is more likely that a wrong decision can be identified and corrected.
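A simple way to implement the second form of involvement is to route low-confidence predictions to a human reviewer; in the sketch below, the model, confidence threshold, and review queue are all assumptions made for illustration.

```python
# Illustrative sketch: defer low-confidence algorithmic decisions to a human.
import numpy as np

CONFIDENCE_THRESHOLD = 0.8   # assumed cut-off below which a human decides

def decide(model, x, review_queue):
    """Return an automatic decision, or defer to a human reviewer when the
    model's confidence in its prediction is low."""
    proba = model.predict_proba(x.reshape(1, -1))[0]
    confidence = proba.max()
    if confidence < CONFIDENCE_THRESHOLD:
        # Hand the case to a human: they see the input and the model's scores,
        # which makes the decision reviewable and correctable.
        review_queue.append({"input": x, "model_scores": proba})
        return "pending_human_review"
    return int(np.argmax(proba))
```

The threshold and the information shown to reviewers are design choices that should follow from the stakeholder needs identified earlier.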