Last week we shared some thoughts about bias in algorithms following the furore over exam results. The Scottish government has now published its response to the European Commission's White Paper on AI. The response covers many aspects of AI including facial recognition but also considers the topical issue of bias and algorithms.
In February, the European Commission issued its White Paper (along with a data strategy), which envisaged trustworthy artificial intelligence, based on excellence and trust. Among other aspects of AI, it also said that AI systems should be transparent, traceable and guarantee human oversight. Authorities should be able to test and certify the data used by algorithms in the same way that they check cosmetics, cars or toys. It said that unbiased data is needed to train high-risk systems to perform properly, and to ensure respect of fundamental rights, especially non-discrimination.
The Scottish government welcomes the White Paper and its response talks about algorithms and bias in some detail. It agrees with the European Commission that training data is fundamental to how AI machine learning applications perform, and that “measures should therefore be taken to ensure that, where it comes to the data used to train AI systems, the EU’s values and rules are respected, specifically in relation to safety and existing legislative rules for the protection of fundamental rights”. It also agrees on the importance of the three key considerations listed by the White Paper, which are safety, bias and discrimination, and privacy.
However, the response notes that assessing the quality, diversity, and general fitness for purpose of a training dataset is a complex task, with implications for the skills that will be required from regulators and developers. SMEs are at a double disadvantage, in terms of access to large, high-quality datasets (compared with eg Google and Facebook), and access to the skills and resources required to critically assess them and take remedial measures (such as correcting for bias).
The Scottish government also points out that it is intrinsically difficult to assess whether a training dataset is “good enough”, in terms of how its characteristics will translate into an AI application making decisions that are safe and protect fundamental rights. A biased training dataset is unlikely to lead to good real-world performance, or adequate protection of those rights. But apparent “diversity” in the training data is no guarantee that the resulting application makes non-discriminatory decisions, or is generally fit for purpose. It would also be difficult to establish that a dataset covers all potential “dangerous” scenarios, as those are typically uncovered after the event. As pointed out in the White Paper, there will be a need to also verify the “relevant programming and training methodologies, processes and techniques used to build, test and validate AI systems” – not only the training data in isolation. This will place further demands on the skills of regulators, particularly if they are required to inspect the details of algorithms and underlying mathematics.
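For readers who want to see why apparent diversity in a dataset is not the same as non-discriminatory outcomes, the point can be illustrated with a rough sketch. The figures, the group labels "A" and "B", and the function names below are all invented for illustration; they do not come from the White Paper or the Scottish government's response, and the ratio shown is only one of many possible measures, not a legal test.

```python
from collections import Counter


def group_shares(records):
    """Share of records belonging to each group (a crude 'diversity' measure)."""
    counts = Counter(r["group"] for r in records)
    total = len(records)
    return {g: n / total for g, n in counts.items()}


def selection_rates(records):
    """Share of favourable decisions within each group."""
    totals = Counter(r["group"] for r in records)
    favourable = Counter(r["group"] for r in records if r["approved"])
    return {g: favourable[g] / totals[g] for g in totals}


def rate_ratio(rates):
    """Lowest group selection rate divided by the highest (1.0 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no group received a favourable decision
    return min(rates.values()) / highest


# Hypothetical decisions made by some system: both groups are equally
# represented, yet group A is approved twice as often as group B.
decisions = (
    [{"group": "A", "approved": True}] * 8
    + [{"group": "A", "approved": False}] * 2
    + [{"group": "B", "approved": True}] * 4
    + [{"group": "B", "approved": False}] * 6
)

shares = group_shares(decisions)
rates = selection_rates(decisions)
ratio = rate_ratio(rates)

print("Representation:", shares)
print("Selection rates:", rates)
print("Rate ratio:", ratio)
```

In this made-up example the two groups each make up half of the records, so the data looks balanced, but the outcomes are not. A check on representation alone would miss that, which is why the response stresses looking at methods and results as well as the training data.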
The implementation of remedies for training data will also pose challenges. The Commission suggests that a possible remedy is “re-training the system in the EU in such a way as to ensure that all applicable requirements are met.” However, it is unlikely that training a system in the EU would be either necessary or sufficient to meet those requirements. Such a remedy also raises the issue of how developers would access appropriate EU datasets for their AI application. The Scottish government queries whether such data would be made available through the Commission, or if it would be necessary for firms to procure such datasets from third party companies or collect data independently. This might put certain businesses, such as SMEs, at a disadvantage.
Bias in algorithms is not an easy issue to resolve. As we pointed out in our previous post, the Home Office stopped using an algorithm to consider visa applications when it was alleged that it was racist, so the issue is far wider than exam results. If you are planning to use an algorithm in your business, stop and think about the issues of bias and discrimination at an early stage.