Interesting legislative proposal to make procurement of AI conditional on external checks

Procurement is progressively being put in the position of regulating which types of artificial intelligence (AI) are deployed by the public sector (ie taking on a gatekeeping function; see here and here). This implies that the procurement function should be able to verify that the intended AI (and its use/foreseeable misuse) will not cause harms—or, where harms are unavoidable, come up with a system to weigh, and if appropriate/possible manage, that risk. I am currently trying to understand the governance implications of this emerging gatekeeping role in order to assess whether procurement is best placed to carry it out.

In the context of this reflection, I found a very useful recent paper: M E Kaminski, ‘Regulating the Risks of AI’ (2023) 103 Boston University Law Review, forthcoming. In addition to providing a useful critique of the treatment of AI harms as risk, and of the regulatory baggage that (different types of) risk regulation carries with it, Kaminski provides an overview of a very interesting legislative proposal: Washington State’s Bill SB 5116.

Bill SB 5116 is a proposal for new legislation ‘establishing guidelines for government procurement and use of automated decision systems in order to protect consumers, improve transparency, and create more market predictability'. The governance approach underpinning the Bill is interesting in two respects.

First, the Bill includes a ban on certain uses of AI in the public sector. As Kaminski summarises: ‘Sec. 4 of SB 5116 bans public agencies from engaging in (1) the use of an automated decision system that discriminates, (2) the use of an “automated final decision system” to “make a decision impacting the constitutional or legal rights… of any Washington resident”, (3) the use of an “automated final decision system…to deploy or trigger any weapon”, (4) the installation in certain public places of equipment that enables AI-enabled profiling, (5) the use of AI-enabled profiling “to make decisions that produce legal effects or similarly significant effects concerning individuals”’ (at 66, fn 398).

Second, the Bill subjects the procurement of the AI to approval by the director of the office of the chief information officer. As Kaminski clarifies: ‘The bill’s assessment process is thus more like a licensing scheme than many proposed impact assessments in that it envisions a central regulator serving a gatekeeping function (albeit probably not an intensive one, and not over private companies, which aren’t covered by the bill at all). In fact, the bill is more protective than the GDPR in that the state CIO must make the algorithmic accountability report public and invite public comment before approving it’ (at 66, references omitted).

What the Bill does, then, is to displace the gatekeeping role from the procurement function itself to the office of the state chief information officer. It also sets out the specific substantive criteria that this external authority has to apply in deciding whether to authorise the procurement of the AI.

Without getting into the detail of the Washington Bill, this governance approach seems to have two main strengths over the current emerging model of procurement self-regulation of the gatekeeping role (in the EU).

First, it facilitates a standardisation of the substantive criteria to be applied in assessing the potential harms resulting from AI adoption in the public sector, with a focus on the specific characteristics of decision-making in this context. Importantly, it creates a clear area of illegality. Some of it is in line with eg the prohibition of certain AI uses in the Draft EU AI Act (profiling), or in the GDPR (prohibition of solely automated individual decision-making, including profiling — although it may go beyond it). Moreover, such an approach would allow for an expansion of prohibited uses in the specific context of the public sector, which the EU AI Act mostly fails to tackle (see here). It would also allow for the specification of constraints applicable to the use of AI by the public sector, such as a heightened obligation to provide reasons (see M Fink & M Finck, ‘Reasoned A(I)dministration: Explanation Requirements in EU Law and the Automation of Public Administration’ (2022) 47(3) European Law Review 376-392).

Second, it introduces an element of external (independent) verification of the assessment of potential AI harms. I think this is a crucial governance point because most proposals relying on internal (self-)assessment by the procurement team fail to consider the extent to which such an approach ensures (a) adequate resourcing (eg specialism and experience in the type of assessment) and (b) sufficient objectivity in the assessment. On the second point, with procurement teams often being told to ‘just go and procure what is needed’, moving to a position of gatekeeper or controller could be too big an ask (depending on institutional aspects that require closer consideration). Moreover, this would be different from other aspects of gatekeeping that procurement has progressively been asked to take on (also excessively, in my view: see here).

When the procurement function is asked to screen for eg potential contractors’ social or environmental compliance track record, it is usually at arm’s length from those being reviewed (and the rules on conflicts of interest are there to strengthen that position). Conversely, when the procurement function is asked to screen for the likely impact on citizens and/or users of public services of an initiative promoted by the operational part of the organisation to which it belongs, things are much more complicated.

That is why some systems (like the US FAR) create elements of separation between the procurement team and those in charge of reviewing eg competition issues (by means of the competition advocate). This is a model reflected in the Washington Bill’s approach to requiring external (even if within the public administration) verification and approval of the AI impact assessment. If procurement is to become a properly functioning gatekeeper of the adoption of AI by the public sector, this regulatory approach (ie having an ‘AI Harms Controller’) seems promising. Definitely a model worth thinking about for a little longer.

Algorithmic transparency: some thoughts on the UK's first four published disclosures and the standard's usability


The Algorithmic Transparency Standard (ATS) is one of the UK’s flagship initiatives for the regulation of public sector use of artificial intelligence (AI). The ATS encourages (but does not mandate) public sector entities to fill in a template to provide information about the algorithmic tools they use, and why they use them [see e.g. Kingsman et al (2022) for an accessible overview].

The ATS is currently being piloted, and has so far resulted in the publication of four disclosures relating to the use of algorithms in different parts of the UK’s public sector. In this post, I offer some thoughts based on these initial four disclosures, in particular from the perspective of the usability of the ATS in facilitating an enhanced understanding of AI use cases, and accountability for those.

The first four disclosed AI use cases

The ATS pilot has so far published information in two batches (on 1 June and 6 July 2022), comprising the following four AI use cases:

  1. Within the Cabinet Office, the GOV.UK Data Labs team piloted the ATS for their Related Links tool: a recommendation engine built to aid navigation of GOV.UK (the primary UK central government website) by providing relevant onward journeys from a content page, with the aim of helping users find useful information and content.

  2. In the Department for Health and Social Care and NHS Digital, the QCovid team piloted the ATS with a COVID-19 clinical tool used to predict how at risk individuals might be from COVID-19. The tool was developed for use by clinicians in support of conversations with patients about personal risk, and it uses algorithms to combine a number of factors such as age, sex, ethnicity, height and weight (to calculate BMI), and specific health conditions and treatments in order to estimate the combined risk of catching coronavirus and being hospitalised or catching coronavirus and dying. Importantly, “The original version of the QCovid algorithms were also used as part of the Population Risk Assessment to add patients to the Shielded Patient List in February 2021. These patients were advised to shield at that time, were provided support for doing so, and were prioritised for COVID-19 vaccination.”

  3. The Information Commissioner's Office (ICO) piloted the ATS with its Registration Inbox AI, which uses a machine learning algorithm to categorise emails sent to the ICO’s registration inbox and to send out an auto-reply where the algorithm “detects … a request about changing a business address. In cases where it detects this kind of request, the algorithm sends out an autoreply that directs the customer to a new online service and points out further information required to process a change request. Only emails with an 80% certainty of a change of address request will be sent an email containing the link to the change of address form.”

  4. The Food Standards Agency piloted the ATS with its Food Hygiene Rating Scheme (FHRS) – AI, an algorithmic tool to help local authorities prioritise inspections of food businesses by predicting their food hygiene rating and, in particular, which establishments might be at a higher risk of non-compliance with food hygiene regulations. Importantly, use of the tool is voluntary and “it is not intended to replace the current approach to generate a FHRS score. The final score will always be the result of an inspection undertaken by [a local authority] officer.”

Harmless (?) use cases

At first glance, and on the basis of the implications of the outcome of the algorithmic recommendation, it would seem that the four use cases are relatively harmless:

  1. If GOV.UK recommends links to content that is not relevant or helpful, the user may simply ignore them.

  2. The outcome of the QCovid tool simply informs the GPs’ (or other clinicians’) assessment of the risk of their patients, and the GPs’ expertise should mediate any incorrect (either over-inclusive, or under-inclusive) assessments by the AI.

  3. If the ICO sends an automatic email with information on how to change their business address to somebody who had submitted a different query, the recipient can simply ignore that email.

  4. Incorrect or imperfect prioritisation of food businesses for inspection could result in the early inspection of a low-risk restaurant, or the late(r) inspection of a higher-risk restaurant, but this is already a risk implicit in allowing restaurants to open pending inspection; AI does not add risk.

However, this approach could be too simplistic or optimistic. It can be helpful to think about what could really happen if the AI got it wrong ‘in a disaster scenario’ based on possible user reactions (a useful approach promoted by the Data Hazards project). It seems to me that, on ‘worst case scenario’ thinking (and without seeking to be exhaustive):

  1. If GOV.UK recommends content that is not helpful but confusing, the user can either engage in red tape they did not need to complete (wasting both their time and public resources) or, worse, feel overwhelmed, confused or misled and abandon the administrative interaction they were initially seeking to complete. This can lead to exclusion from public services, and be particularly problematic if such situations have a differential impact on different user groups.

  2. There could be over-reliance on the QCovid algorithm by (too busy) GPs. This could lead to advising ‘as a matter of routine’ the taking of excessive precautions with significant potential impacts on the day-to-day lives of those affected—as was arguably the case for some of the citizens included in shielding categories by the earlier incarnation of the algorithm. Conversely, GPs who identified problems in the early use of the algorithm could simply ignore it, thus potentially losing the benefits of the algorithm in other cases where it could have been helpful—potentially leading to under-precaution by individuals who could have otherwise been better safeguarded.

  3. Similarly to 1, the provision of irrelevant and potentially confusing information can lead to a waste of resources (e.g. users seeking to change their business registration address because they wrongly think it is a requirement to process their query or, at the lower end of the scale, users having to read and consider information about an administrative process they have no interest in). Beyond that, the classification algorithm could lead to queries being lost if there was no human check to verify that the AI classification was correct. If that check takes place anyway, the advantages of automating the sending of the initial email seem rather marginal.

  4. Similarly to 2, the incorrect prediction of risk can lead to a misuse of resources in the carrying out of inspections by local authorities, potentially pushing down the list of restaurants pending inspection some that are high-risk and that could thus see their inspection repeatedly delayed. This could have important public health implications, at least for those citizens using the yet-to-be-inspected restaurants for longer than they would otherwise have. Conversely, inaccurate prioritisations that did not seem to catch the more ‘risky’ restaurants could also lead to local authorities abandoning use of the tool. There is also a risk of profiling of certain types of businesses (and their owners), which could lead to victimisation if the tool was improperly used, or used in relation to restaurants that have been active for a longer period (eg to trigger fresh (re)inspections).

No AI application is thus entirely harmless. Of course, this is just a matter of theoretical speculation—just as it could be speculated whether reduced engagement with the AI would generate a second-tier negative effect, eg if ‘learning’ algorithms could not be revised and improved on the basis of ‘real-life’ feedback on whether their predictions were or were not accurate.

I think that this sort of speculation offers a useful yardstick to assess the extent to which the ATS can be helpful and usable. I would argue that the ATS will be helpful to the extent that (a) it provides information capable of clarifying whether the relevant risks have been taken into account and properly mitigated or, failing that, (b) it provides information that can be used to challenge the insufficiency of any underlying risk assessments or mitigation strategies. Ultimately, AI transparency is not an end in itself, but simply a means of increasing accountability—at least in the context of public sector AI adoption. It is clear that any degree of transparency generated by the ATS will be an improvement on the current situation, but is the ATS really usable?

Finding out more on the basis of the ATS disclosures

To try to answer that general question on whether the ATS is usable and serves to facilitate increased accountability, I have read the four disclosures in full. Here is my summary/extracts of the relevant bits for each of them.

GOV.UK Related Links

Since May 2019, the tool has been using an algorithm called node2vec (a machine learning algorithm that learns network node embeddings) to train a model on the last three weeks of user movement data (web analytics data). The benefits are described as follows: “the tool … predicts related links for a page. These related links are helpful to users. They help users find the content they are looking for. They also help a user find tangentially related content to the page they are on; it’s a bit like when you are looking for a book in the library, you might find books that are relevant to you on adjacent shelves.”

The way the tool works is described in some more detail: “The tool updates links every three weeks and thus tracks changes in user behaviour.” “Every three weeks, the machine learning algorithm is trained using the last three weeks of analytics data and trains a model that outputs related links that are published, overwriting the existing links with new ones.” “The average click through rate for related links is about 5% of visits to a content page. For context, GOV.UK supports an average of 6 million visits per day (Jan 2022). True volumes are likely higher owing to analytics consent tracking. We only track users who consent to analytics cookies …”.
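Pulling those elements together, the pipeline described in the disclosure could be sketched roughly as follows. This is purely illustrative: the graph construction, library calls and parameters are my own assumptions (using the open-source node2vec package), not the actual GOV.UK implementation.

```python
# Illustrative sketch only: not the actual GOV.UK pipeline.
# Assumes three weeks of analytics data expressed as (from_page, to_page) transitions.
import networkx as nx
from node2vec import Node2Vec  # pip install node2vec

def train_related_links(transitions, top_n=5):
    """Train node embeddings on user journeys and emit candidate related links per page."""
    # Build a graph in which pages are nodes and observed onward journeys are weighted edges.
    graph = nx.DiGraph()
    for from_page, to_page in transitions:
        if graph.has_edge(from_page, to_page):
            graph[from_page][to_page]["weight"] += 1
        else:
            graph.add_edge(from_page, to_page, weight=1)

    # Learn an embedding for every page (node2vec = biased random walks + word2vec).
    model = Node2Vec(graph, dimensions=64, walk_length=30, num_walks=100).fit(
        window=10, min_count=1
    )

    # For each page, propose the most similar pages as 'related links'.
    return {
        page: [p for p, _ in model.wv.most_similar(str(page), topn=top_n)]
        for page in graph.nodes
    }
```

On the disclosure's description, the deny list and the publisher overrides would then sit on top of whatever such a retraining job outputs every three weeks.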

The decision process is fully automated, but there is “a way for publishers to add/amend or remove a link from the component. On average this happens two or three times a month.” “Humans have the capability to recommend changes to related links on a page. There is a process for links to be amended manually and these changes can persist. These human expert generated links are preferred to those generated by the model and will persist.” Moreover, “GOV.UK has a feedback link, “report a problem with this page”, on every page which allows users to flag incorrect links or links they disagree with.” The tool was subjected to a Data Protection Impact Assessment (DPIA), but no other impact assessments (IAs) are listed.

When it comes to risk identification and mitigation, the disclosure indicates that “A recommendation engine can produce links that could be deemed wrong, useless or insensitive by users (e.g. links that point users towards pages that discuss air accidents)”, and that, as mitigation: “We added pages to a deny list that might not be useful for a user (such as the homepage) or might be deemed insensitive (e.g. air accident reports). We also enabled publishers or anyone with access to the tagging system to add/amend or remove links. GOV.UK users can also report problems through the feedback mechanisms on GOV.UK.”

Overall, then, the risk I had identified is only superficially acknowledged, in that the ATS disclosure does not show awareness of the potentially different implications of incorrect or useless recommendations across the spectrum. The narrative equating the recommendations to browsing the shelves of a library is quite suggestive in that regard, as is the fact that the quality controls are rather limited.

Indeed, it seems that the quality control mechanisms require a high level of effort from every publisher, as they need to check every three weeks whether the (new) related links appearing in each of the pages they publish are relevant and unproblematic. This seems to have reversed the functional balance of convenience. Before the implementation of the tool, only approximately 2,000 out of 600,000 pieces of content on GOV.UK (roughly 0.3%) had related links, as they had to be created manually (and thus, hopefully, were relevant, if not necessarily unproblematic). Now, almost all pages have up to five related content suggestions, but only two or three out of 600,000 pages see their links manually amended per month. A question arises whether this extremely low rate of manual intervention reflects the high quality of the system, or is instead evidence of the lack of resources for quality assurance that previously prevented over 99% of pages from having this type of related information.

However, despite the queries as to the desirability of the AI implementation as described, the ATS disclosure is in itself useful because it allows the type of analysis above and, in case someone considers the situation unsatisfactory or would like to probe it further, there is a clear gateway to (try to) engage the entity responsible for this AI deployment.

QCovid algorithm

The algorithm was developed at the onset of the Covid-19 pandemic to drive government decisions on which citizens to advise to shield, support during shielding, and prioritise for vaccination rollout. Since the end of the shielding period, the tool has been modified. “The clinical tool for clinicians is intended to support individual conversations with patients about risk. Originally, the goal was to help patients understand the reasons for being asked to shield and, where relevant, help them do so. Since the end of shielding requirements, it is hoped that better-informed conversations about risk will have supported patients to make appropriate decisions about personal risk, either protecting them from adverse health outcomes or to some extent alleviating concerns about re-engaging with society.”

“In essence, the tool creates a risk calculation based on scoring risk factors across a number of data fields pertaining to demographic, clinical and social patient information.” “The factors incorporated in the model include age, ethnicity, level of deprivation, obesity, whether someone lived in residential care or was homeless, and a range of existing medical conditions, such as cardiovascular disease, diabetes, respiratory disease and cancer. For the latest clinical tool, separate versions of the QCOVID models were estimated for vaccinated and unvaccinated patients.”
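For readers unfamiliar with this type of tool, the structure being described is that of a model converting a set of demographic and clinical factors into a single risk score. The sketch below is purely illustrative of that structure: the factor names echo the disclosure, but the weights, baseline and functional form are invented and bear no relation to the actual QCOVID models or their published coefficients.

```python
# Purely illustrative sketch of an additive risk model; not the QCOVID specification.
import math

EXAMPLE_WEIGHTS = {                      # hypothetical log-odds contributions
    "age_per_decade_over_40": 0.60,
    "bmi_per_5_units_over_25": 0.20,
    "diabetes": 0.45,
    "respiratory_disease": 0.50,
    "residential_care": 0.90,
}

def illustrative_risk_score(patient, baseline_log_odds=-6.0):
    """Combine demographic and clinical factors into a single 0-1 score (illustrative only)."""
    log_odds = baseline_log_odds
    log_odds += EXAMPLE_WEIGHTS["age_per_decade_over_40"] * max(0, (patient["age"] - 40) / 10)
    log_odds += EXAMPLE_WEIGHTS["bmi_per_5_units_over_25"] * max(0, (patient["bmi"] - 25) / 5)
    for condition in ("diabetes", "respiratory_disease", "residential_care"):
        if patient.get(condition, False):
            log_odds += EXAMPLE_WEIGHTS[condition]
    return 1 / (1 + math.exp(-log_odds))   # logistic transform into a probability-style score
```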

It is difficult to assess how intensely the tool is (currently) used, although the ATS indicates that “In the period between 1st January 2022 and 31st March 2022, there were 2,180 completed assessments” and that “Assessment numbers often move with relative infection rate (e.g. higher infection rate leads to more usage of the tool).” The ATS also stresses that “The use of the tool does not override any clinical decision making but is a supporting device in the decision making process.” “The tool promotes shared decision making with the patient and is an extra point of information to consider in the decision making process. The tool helps with risk/benefit analysis around decisions (e.g. recommendation to shield or take other precautionary measures).”

The impact assessment of this tool is driven by the assessments mandated for medical devices. The description is thus rather technical and not very detailed, although the selected examples it includes do capture the possibility of somebody being misidentified “as meeting the threshold for higher risk”, as well as someone not having “an output generated from the COVID-19 Predictive Risk Model”. The ATS does stress that “As part of patient safety risk assessment, Hazardous scenarios are documented, yet haven’t occurred as suitable mitigation is introduced and implemented to alleviate the risk.” That mitigation largely seems to be that “The tool is designed for use by clinicians who are reminded to look through clinical guidance before using the tool.”

I think this case shows two things. First, that it is difficult to understand how different parts of the analysis fit together when a tool that has had two very different uses is the object of a single ATS disclosure. There seems to be a good argument for use-case-specific ATS disclosures, even if the underlying AI deployment is the same (or a closely related one), as the implications of different uses also differ from a governance perspective.

Second, that in the context of AI adoption for healthcare purposes, there is a dual barrier to accessing relevant (and understandable) information: the tech barrier and the medical barrier. While the ATS does something to reduce the former, the latter very much remains in place and perhaps turns the issue of trustworthiness of the AI into one of trustworthiness of the clinician, which is not necessarily entirely helpful (not only in this specific use case, but in many others one can imagine). In that regard, it seems that the usability of the ATS is partially limited, and more could be done to increase meaningful transparency through AI-specific IAs, perhaps as proposed by the Ada Lovelace Institute.

In this case, the ATS disclosure has also provided some valuable information, but arguably to a lesser extent than the previous case study.

ICO’s Registration Inbox AI

This is a tool that very much resembles other forms of email classification (e.g. spam filters), as “This algorithmic tool has been designed to inspect emails sent to the ICO’s registration inbox and send out autoreplies to requests made about changing addresses. The tool has not been designed to automatically change addresses on the requester’s behalf. The tool has not been designed to categorise other types of requests sent to the inbox.”

The disclosure indicates that “In a significant proportion of emails received, a simple redirection to an online service is all that is required. However, sifting these types of emails out would also require time if done by a human. The algorithm helps to sift out some of these types of emails that it can then automatically respond to. This enables greater capacity for [Data Protection] Fees Officers in the registration team, who can, consequently, spend more time on more complex requests.” “There is no manual intervention in the process - the links are provided to the customer in a fully automated manner.”

The tool has been in use since May 2021 and classifies approximately 23,000 emails a month.

When it comes to risk identification and mitigation, the ATS disclosure stresses that “The algorithmic tool does not make any decisions, but instead provides links in instances where it has calculated the customer has contacted the ICO about an address change, giving the customer the opportunity to self-serve.” Moreover, it indicates that there is “No need for review or appeal as no decision is being made. Incorrectly classified emails would receive the default response which is an acknowledgement.” It further stresses that “The classification scope is limited to a change of address and a generic response stating that we have received the customer’s request and that it will be processed within an estimated timeframe. Incorrectly classified emails would receive the default response which is an acknowledgement. This will not have an impact on personal data. Only emails with an 80% certainty of a change of address request will be sent an email containing the link to the change of address form.”
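Taken together, the disclosure describes what is essentially a confidence threshold sitting on top of a binary classifier. A minimal sketch of that decision logic is below; the classifier interface, reply handling and return value are my own assumptions, not the ICO's implementation.

```python
# Illustrative sketch of the decision logic described in the disclosure,
# not the ICO's actual implementation. 'classifier' is assumed to be a trained
# text-classification model exposing a scikit-learn-style predict_proba().
CHANGE_OF_ADDRESS_THRESHOLD = 0.80  # the disclosure mentions an 80% certainty threshold

def handle_registration_email(email_text, classifier, send_autoreply, send_acknowledgement):
    """Autoreply with the change-of-address link only when the classifier is confident enough."""
    probability = classifier.predict_proba([email_text])[0][1]

    if probability >= CHANGE_OF_ADDRESS_THRESHOLD:
        # Confident change-of-address request: point the customer to the online service.
        send_autoreply(email_text)
    else:
        # Everything else gets the default acknowledgement described in the disclosure.
        send_acknowledgement(email_text)

    # What the disclosure does not clarify is whether autoreplied emails also stay in the
    # officers' queue for a human check; that open question is discussed below.
    return probability
```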

In my view, this disclosure does not entirely clarify the way the algorithm works (e.g. what happens to emails classified as having requested information on change of address? Are they ‘deleted’ from the backlog of emails requiring a (human) non-automated response?). However, it does provide sufficient information to further consolidate the questions arising from the general description. For example, it seems that the identification of risks is clearly partial in that there is not only a risk of someone asking for change of address information not automatically receiving it, but also a risk of those asking for other information receiving the wrong information. There is also no consideration of additional risks (as above), and the general description makes the claim of benefits doubtful if there has to be a manual check to verify adequate classification.

The ATS disclosure does not provide sufficient contact information for the owner of the AI (perhaps because it was contracted on limited after-service terms…), although there is generic contact information for the ICO that could be used by someone who considered the situation unsatisfactory or would like to probe it further.

Food Hygiene Rating Scheme – AI

This tool is also based on machine learning to make predictions. “A machine learning framework called LightGBM was used to develop the FHRS AI model. This model was trained on data from three sources: internal Food Standards Agency (FSA) FHRS data, publicly available Census data from the 2011 census and open data from HERE API. Using this data, the model is trained to predict the food hygiene rating of an establishment awaiting its first inspection, as well as predicting whether the establishment is compliant or not.” “Utilising the service, the Environmental Health Officers (EHOs) are provided with the AI predictions, which are supplemented with their knowledge about the businesses in the area, to prioritise inspections and update their inspection plan.”
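As described, this is a fairly standard supervised learning set-up. A minimal sketch of what such a training step could look like is below; the feature engineering, column names and parameters are my own assumptions, not the FSA's actual pipeline.

```python
# Illustrative sketch of the described set-up, not the FSA's actual pipeline.
# Assumes a pandas DataFrame of already-inspected establishments combining FHRS
# inspection history, 2011 Census features and HERE API-derived features.
import lightgbm as lgb
from sklearn.model_selection import train_test_split

def train_fhrs_models(df, feature_cols):
    """Train one model for the hygiene rating and one for compliance (illustrative)."""
    X = df[feature_cols]
    X_train, X_test, rating_train, rating_test, comp_train, comp_test = train_test_split(
        X, df["hygiene_rating"], df["is_compliant"], test_size=0.2, random_state=0
    )

    # One model predicts the likely food hygiene rating (multiclass labels inferred
    # from the training data) ...
    rating_model = lgb.LGBMClassifier()
    rating_model.fit(X_train, rating_train)

    # ... and a second predicts compliant vs non-compliant.
    compliance_model = lgb.LGBMClassifier()
    compliance_model.fit(X_train, comp_train)

    return rating_model, compliance_model

# Predictions for establishments 'Awaiting Inspection' would then be one input, alongside
# local knowledge, for Environmental Health Officers prioritising their inspection plans.
```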

Regarding the justification for the development, the disclosure stresses that “the number of businesses classified as ‘Awaiting Inspection’ on the Food Hygiene Rating Scheme website has increased steadily since the beginning of the pandemic. This has been the key driver behind the development of the FHRS AI use case.” “The objective is to help local authorities become more efficient in managing the hygiene inspection workload in the post-pandemic environment of constrained resources and rapidly evolving business models.”

Interestingly, the disclosure states that the tool “has not been released to actual end users as yet and hence the maintenance schedule is something that cannot be determined at this point in time (June 2022). The Alpha pilot started at the beginning of April 2022, wherein the end users (the participating Local Authorities) have access to the FHRS AI service for use in their day-to-day workings. This section will be updated depending on the outcomes of the Alpha Pilot ...” It remains to be seen whether there will be future updates to the disclosure, but a copy-pasting error in the ATS disclosure means it contains the same paragraph dated February 2022. This stresses the need to date and reference (eg v.1, v.2) successive versions of the same disclosure, which does not seem to be a field in the current template, as well as to create a repository of earlier versions.

The section on oversight stresses that “the system has been designed to provide decision support to Local Authorities. FSA has advised Local Authorities to never use this system in place of the current inspection regime or use it in isolation without further supporting information”. It also stresses that “Since there will be no change to the current inspection process by introducing the model, the existing appeal and review mechanisms will remain in place. Although the model is used for prioritisation purposes, it should not impact how the establishment is assessed during the inspection and therefore any challenges to a food hygiene rating would be made using the existing FHRS appeal mechanism.”

The disclosure also provides detailed information on IAs: “The different impact assessments conducted during the development of the use case were 1. Responsible AI Risk Assessment; 2. Stakeholder Impact Assessment; [and] 3. Privacy Impact Assessment.” Concerning the responsible AI risk assessment, in addition to a personal data issue that should belong in the DPIA, the disclosure reports three identified risks very much in line with the ones I had hinted at above: “2. Potential bias from the model (e.g. consistently scoring establishments of a certain type much lower, less accurate predictions); 3. Potential bias from inspectors seeing predicted food hygiene ratings and whether the system has classified the establishment as compliant or not. This may have an impact on how the organisation is perceived before receiving a full inspection; 4. With the use of AI/ML there is a chance of decision automation bias or automation distrust bias occurring. Essentially, this refers to a user being over or under reliant on the system leading to a degradation of human-reasoning.”

The disclosure presents related mitigation strategies as follows: “2. Integration of explainability and fairness related tooling during exploration and model development. These tools will also be integrated and monitored post-alpha testing to detect and mitigate potential biases from the system once fully operational; 3. Continuously reflect, act and justify sessions with business and technical subject matter experts throughout the delivery of the project, along with the use of the three impact assessments outlined earlier to identify, assess and manage project risks; 4. Development of usage guidance for local authorities specifically outlining how the service is expected to be used. This document also clearly states how the service should not be used, for example, the model outcome must not be the only indicator used when prioritising businesses for inspection.”

In this instance, the ATS disclosure is also useful in itself because it allows the type of analysis above and, in case someone considers the situation unsatisfactory or would like to probe it further, there is a clear gateway to (try to) engage the entity responsible for this AI deployment. It is also interesting to see that the disclosure specifies that the private provider was engaged “As well as [in] a development role [… to provide] Responsible AI consulting and delivery services, including the application of a parallel Responsible AI sprint to assess risk and impact, enable model explainability and assess fairness, using a variety of artefacts, processes and tools”. This is clearly reflected in the ATS disclosure and could be an example of good practice where organisations lack that in-house capability and/or outsource the development of the AI. Whether that role should fall to the developer, or should rather be kept separate to avoid organisational conflicts of interest, is a discussion for another day.

Final thoughts

There seems to be a mixed picture on the usability of the ATS disclosures, with some of them not entirely providing (full) usability, or a clear pathway to engage with the specific entity in charge of the development of the algorithmic tool, especially if it was an outsourced provider. In those cases, the public authority that has implemented the AI (even if not the owner of the project) will have to deal with any issues arising from the disclosure. There is also mixed practice concerning linking to resources other than previously available (open) data (eg open source code, data sources), with only one project (GOV.UK) including them in the disclosures discussed above.

It will be interesting to see how this assessment scales up (to use a term) once disclosures increase in volume. There is clearly a research opportunity arising as soon as more ATS disclosures are published. As a hypothesis, I would submit that disclosure quality is likely to decline with volume, as well as with the withdrawal of whatever support the pilot phase provided to participating institutions. Let’s see how that empirical issue can be assessed.

The other reflection I have to offer based on these first four disclosures is that there are points of information in the disclosures that can be useful, at least from an academic (and journalistic?) perspective, to assess the extent to which the public sector has the capabilities it needs to harness digital technologies (more on that soon in this blog).

The four reviewed disclosures show that there was one in-house development (GOV.UK), while the others were either procured (QCovid, the disclosure for which includes a redacted copy of the contract), or contracted out, perhaps even directly awarded (ICO email classifier and FSA FHRS – AI). There are also some between-the-lines indications that some of the implementations may have been developed somewhat haphazardly, unless there was strong pre-existing reliable statistical data (eg on information requests concerning change of business address). This in itself triggers questions on the procurement or commissioning strategy developed by institutions seeking to harness the potential of AI.

From this perspective, the ATS disclosures can be a useful source of information on the extent to which the adoption of AI by the public sector depends as strongly on third-party capabilities as the literature generally hypothesises and/or is starting to demonstrate empirically.

The importance of procurement for public sector AI uptake

In case there was any question on the importance and central role of public procurement for the uptake of artificial intelligence (AI) by the public sector (there wasn’t, though), two recent policy reports confirm that this is the case, at the very least in the European context.

AI Watch’s must-read ‘European landscape on the use of Artificial Intelligence by the Public Sector’ (1 June 2022) makes the point very clearly by reference to the analysis of AI strategies adopted by 24 EU Member States: ‘the procurement of AI technologies or the increased collaboration with innovative private partners is seen as an important way to facilitate the introduction of AI within the public sector. Guidance on how to stimulate and organise AI procurement by civil servants should potentially be strengthened and shared among Member States’ (at 26). Concerning guidance, the report refers to the European Commission-supported process of developing standard contractual clauses for the procurement of AI (see here), and there is also a twin AI Watch Handbook for the adoption of AI by the public sector (25 May 2022) that includes a recommendation on procurement guidance (‘Promote the development of multilingual guidelines, criteria and tools for public procurement of AI solutions in the public sector throughout Europe’, recommendation 2.5, and details at 34-35).

The European landscape report provides some more interesting detail on national strategies considering AI procurement adaptations.

The need to work together with the private sector in this area is repeatedly stressed. However, strategies mention that historically it has been difficult for innovative companies to work together with government authorities due to cumbersome procurement regulations. In this area, several strategies (12, 50%) [though note the table below indicates 13, rather than 12 strategies] come up with new policy initiatives to improve the procurement processes. The Spanish strategy, for example, mentions that new innovative public procurement mechanisms will be introduced to help the procurement of new solutions from the market, while the Maltese government describes how existing public procurement processes will be changed to facilitate the procurement of emerging technologies such as AI. The Dutch and Czech strategies mention that hackathons for public sector AI will be introduced to assist in the procurement of AI. Civil servants will be given training and awareness in procurement to assist them in this process, something that is highlighted in the Estonian strategy. The French strategy stresses that current procurement regulation already provides a lot of freedom for innovative procurement but that because of risk aversion present within public administrations all possibilities are not taken into consideration (at 25-26, emphasis in the original).

Own elaboration, based on Table 7 in the AI Watch report.

There is also an interesting point on the need to create internal public sector AI capabilities: “Some strategies say that the public organisations should work more together with private organisations (where the missing skillsets are present), either through partnerships or by procurement. On the one hand, this is an extremely important and promising shift in the public sector that more and more must move towards a networking perspective. In fact, the complexity and variety of skills required by AI cannot be always completely internalised. On the other hand, such partnerships and procurement still require a baseline in expertise in AI within the public sector staff to avoid common mistakes or dependency on external parties” (at 23, emphasis added).

Given the strategic importance of procurement, as well as the need to upskill the procurement workforce and to build additional AI capacity in the public sector to manage procurement processes, this is an area of research and policy that will only increase in relevance in the near and longer term.

This same direction of travel is reflected in the UK Central Digital and Data Office’s also recent ‘Transforming for a digital future: 2022 to 2025 roadmap for digital and data’ (9 June 2022). One of its main aspirations is to generate ‘Significant savings by leveraging government’s combined purchasing power and reducing duplicative procurement, to shift to a “buy once, use many times” approach to technology’. This should be achieved by the horizontal promotion of ‘a “buy once, use many times” approach to technology, including by making use of a common code, pattern and architecture repository for government’. Implicitly, this will also require a review of procurement policies and practices.

Importantly—and potentially problematically—it will also raise the stakes of AI procurement, in particular if the roll-out of the ‘bought once’ technology is rushed and its negative impacts or implications can only be identified once it has already been propagated, or in relation to some implementations only. Avoiding this will require very careful impact assessments, as well as piloting and scalability approaches with strong risk-management systems embedded by design.

As always, this will be an area fun to keep an eye on.

The 'NHS Food Scanner' app as a springboard to explore the regulation of public sector recommender systems

In England, the Department of Health and Social Care (DHSC) offers an increasingly wide range of public health-related apps. One of the most recently launched is the ‘Food Scanner’, which aims to provide ‘swap suggestions, which means finding healthier choices for your family is easier than ever!’.

This is part of a broader public health effort to tackle, among other issues, child obesity, and is currently supported by a strong media push aimed primarily at parents. As the parent of two young children, this clearly caught my attention.

The background for this public health intervention is clear:

Without realising it, we are all eating too much sugar, saturated fat and salt. Over time this can lead to harmful changes on the inside and increases the risk of serious diseases in the future. Childhood obesity is a growing issue with figures showing that in England, more than 1 in 4 children aged 4 to 5 years old and more than 1 in 3 children aged 10 to 11 years old are overweight or obese.

The Be Food Smart campaign empowers families to take control of their diet by making healthier food and drink choices. The free app works by scanning the barcode of products, revealing the total sugar, saturated fat and salt inside, and providing hints and tips for adults plus fun food detective activities for kids.

No issues with that. My family and I could do with a few healthier choices. So I downloaded the app and started playing around.

As I scanned a couple of (unavoidably) branded products from the cupboard, I realised that the swaps were not for generic, alternative healthier products, but for other branded products (often of a different brand). While this has the practical advantage of specifying the recommended healthier alternative in an ‘actionable’ manner for the consumer, it made the competition lawyer part of my brain uneasy.

The proposed swaps were (necessarily) ranked and limited, with a ‘top 3’ immediately on display, and a possibility to explore further swaps that was not too easy to spot (unless you scrolled down to the bottom). The different offered swaps also had a ‘like’ button with a counter (still showing very low numbers, probably because the app is very new), but those ‘likes’ did not seem to establish the ranking (or alter it?), as lower-ranked items could have higher like counts (in my limited experiment).

I struggled to make sense of how products are chosen and presented. This piqued my interest, so I looked at how the swaps ‘work’.

The in-app information explained that:

How do we do this?

We look into 3 aspects of the product that you have scanned:
1) Product name; so we can try and find similar products based on the words used within the name.
2) Ingredients list; so we can try and find similar products based on the ingredients of the product you have scanned.
3) Pack size; finally we look into the size of the product you have scanned so that, if you have scanned a 330ml can, we can try and show you another can-sized product rather than a 1 litre bottle.

How are they ordered?

We have a few rules as to what we show within the top 3. We reserve spaces for:
1) The same manufacturer; if you have scanned a particular brand we will do our best to try and find a healthier version of that same brand which qualifies for a good choice badge.
2) The same supermarket; if you have scanned a supermarket product we will again do our best to show you an alternative from the same store.
3) Partner products; there are certain products which team up with Change4life that we will try and show if they match the requirements of the products you have scanned.
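Taking this in-app description at face value, the ‘top 3’ seems to be assembled by scoring candidate products for similarity (name, ingredients, pack size) and then reserving slots for the categories listed above. The sketch below is my own guess at how such a rule could be implemented; every function, field name and scoring choice in it is an assumption, not something taken from the app.

```python
# Illustrative sketch of the swap-ranking rules as described in the app, not its actual code.
def rank_swaps(scanned, candidates, top_n=3):
    """Return up to top_n healthier swaps, reserving slots as per the in-app description."""
    def similarity(candidate):
        # Crude similarity over product-name words, ingredient overlap and pack size/type.
        name_overlap = len(set(scanned["name"].split()) & set(candidate["name"].split()))
        ingredient_overlap = len(set(scanned["ingredients"]) & set(candidate["ingredients"]))
        size_match = 1 if candidate["pack_type"] == scanned["pack_type"] else 0
        return name_overlap + ingredient_overlap + size_match

    # Only products qualifying for a 'good choice' badge are candidates for a swap.
    healthier = sorted(
        (c for c in candidates if c["is_good_choice"]), key=similarity, reverse=True
    )

    top, used = [], set()
    # Reserved slots: same manufacturer, same supermarket, partner product.
    for rule in (
        lambda c: c["manufacturer"] == scanned["manufacturer"],
        lambda c: c.get("supermarket") == scanned.get("supermarket"),
        lambda c: c.get("is_partner_product", False),
    ):
        match = next((c for c in healthier if rule(c) and c["name"] not in used), None)
        if match:
            top.append(match)
            used.add(match["name"])

    # Fill any remaining slots with the most similar healthier products.
    for c in healthier:
        if len(top) >= top_n:
            break
        if c["name"] not in used:
            top.append(c)
            used.add(c["name"])
    return top
```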

I could see that convenience and a certain element of ‘competition neutrality’ were clearly at play, but a few issues bothered me, especially as the interaction between the manufacturer and supermarket criteria is not too clear, and there is a primary but nebulous element of preferencing that I was not expecting in an app meant to provide product-based information. I could see myself spending the night awake, trying to find out how that ‘partnership’ is structured, what the conditions for participating are, whether there are any financial flows to the Department and/or to partner organisations, etc.

I also noticed some quirks or errors in the way information is processed and presented by the Food Scanner app, such as the exact same product (in a different format) being assigned different ‘red light’ classifications (see the Kellogg’s Corn Flakes example on the side bar). At a guess, these divergences could come from the fact that there is no single source for the relevant information (it would seem that ‘The nutrient data provided in the app is supplied by Brandbank and FoodSwitch’) and that there is no entity overseeing the process and curating the data as necessary. In fact, DHSC’s terms and conditions for the Food Scanner app (at 6.10) explicitly state that ‘We do not warrant that any such information is true or accurate and we exclude all liability in respect of the accuracy, completeness, fitness for purpose or legality of that information’. Interesting…

It is also difficult to see how different elements of the red light system (ie sugar vs saturated fat vs salt) are traded off against each other, as eg, sometimes, a red/green/yellow product is recommended to be swapped for a yellow/yellow/yellow product. Working out the scoring system behind such recommendations seems difficult, as there will necessarily be a trade-off between limiting (very) high levels of one of the elements and recommending products that are ‘not very healthy’ on all counts. There has to be a system behind this — in the end, there has to be an algorithm underpinning the app. But how does it work and what science informs it?

These are all questions I am definitely interested in exploring. However, I called it a night and planned to look for some help to investigate this properly (a small research project is in the making and I have recruited a fantastic research associate — keep an eye on the blog for more details). For now, I can only jot down a few thoughts on things that will be interesting to explore, to which I really have no direct answers.

The Food Scanner is clearly a publicly endorsed (and owned? developed?) recommender system. However, even with a moderate research effort, it is very difficult to access useful details on how it works. There is no published algorithmic transparency template (that I could find). The in-app explanations of how the recommender system works raise more questions than they answer.

There is also no commitment by the DHSC to the information provided being ‘true or accurate’, not to mention complete. This displaces the potential liability and all the accountability for the information on display to (a) Brandbank, a commercial entity within the multinational Nielsen conglomerate, and (b) FoodSwitch, a data-technology platform developed by The George Institute for Global Health. The role of these two institutions, in particular concerning the ‘partnership’ between manufacturers and Change4life (now ‘Better Health’ and, effectively, the Office for Health Improvement & Disparities in the DHSC?), is unclear. It is also unclear whether the combination of the datasets operated by both entities is capable of providing a sufficiently comprehensive representation of the products effectively available in England and, in any case, it seems clear to me that there is a high risk (or certainty) that ‘healthy products’ outside mass production and consumption are out of the equation. How this relates to broader aspects of competition, but also of public health policy, can only raise questions.

Additionally, all of this raises quite a few issues from the perspective of the trustworthiness that this type of app can command, as well as the broader competition law implications resulting from the operation of the Food Scanner.

And I am sure that more and more questions will come to mind as I spend more time obsessing about it.

Beyond the specificities of the case, it seems to me that the NHS Food Scanner app is a good springboard to explore the regulation of public sector recommender systems more generally — or, rather, some of the risks implicit in the absence of specific regulation and the difficulties in applying standard regulatory mechanisms (and, perhaps, especially competition law) in this context. Hopefully, there will be some interesting research findings to report by the summer. Stay tuned, and keep healthy!

Recording of webinar on 'Digitalization and AI decision-making in administrative law proceedings'

The Centre for Global Law and Innovation of the University of Bristol Law School and the Faculty of Law at Universidade Católica Portuguesa co-organised an online workshop to discuss emerging issues in digitalization and AI decision-making in administrative law proceedings. I had the great pleasure of chairing it and I think quite a few important issues for further discussion and research were identified. The speakers kindly agreed to share a recording of the session (available here), of which details follow:

Digitalization and AI decision-making in administrative law proceedings

This is a hot area of legal and policy development that has seen an acceleration in the context of the covid-19 pandemic. Emerging research finds points of friction in the simple transposition of administrative law and existing procedures to the AI context, as well as challenges and shortcomings in the judicial review of decisions supported (or delegated) to an AI.

While more and more attention is paid to the use of AI by the public sector, key regulatory proposals such as the European Commission’s Proposal for an Artificial Intelligence Act would largely leave this area to (self)regulation via codes of practice, with the exception of public assistance benefits and services. Self-regulation is also largely the approach taken by the UK in its Guide to using artificial intelligence in the public sector, and the UK courts seem reluctant to engage with the technology underpinning automated decision-making. It is thus arguable that a regulatory gap is increasingly visible and that new solutions and regulatory approaches are required.

The panellists in this workshop covered a range of topics concerning transparency, data protection, automation of decision-making, and judicial review. The panel included (in order of participation):

• Dr Marta Vaz Canavarro Portocarrero de Carvalho, Assistant Professor at the Faculty of Law of Universidade Católica Portuguesa, specialising in administrative law, and member of the Centro de Arbitragem Administrativa (Portuguese Administrative Law Arbitration Centre).

• Dr Filipa Calvão, President of the Comissão Nacional de Proteção de Dados (Portuguese Data Protection Authority) since 2012, and Associate Professor at the Faculty of Law of Universidade Católica Portuguesa.

• Dr Pedro Cerqueira Gomes, Assistant Professor at Universidade Católica Portuguesa and Lawyer at Cerqueira Gomes & Associados, RL, specialising in administrative law and public procurement, and author of EU Public Procurement and Innovation - the innovation partnership procedure and harmonization challenges (Edward Elgar 2021).

• Mr Kit Fotheringham, Teaching Associate and postgraduate research student at the University of Bristol Law School. His doctoral thesis is on administrative law, specifically relating to the use of algorithms, machine learning and other artificial intelligence technologies by public bodies in automated decision-making procedures.

Where does the proposed EU AI Act place procurement?

Thinking about some of the issues raised in the earlier post ‘Can the robot procure for you?’, I have now taken a close look at the European Commission’s Proposal for an Artificial Intelligence Act (AIA) to see how it approaches the use of AI in procurement procedures. It may (not) come as a surprise that the AIA takes an extremely light-touch approach to the regulation of AI uses in procurement and simply subjects them to (yet to be developed) voluntary codes of conduct. In this post, I will detail my analysis of why this is the case, as well as some reasons why I do not find it satisfactory.

Before getting to the details, it is worth stressing that this is reflective of a broader feature of the AIA: its heavy private sector orientation. When it comes to AI uses by the public sector, other than prohibiting some massive surveillance by the State (both for law enforcement and to generate a system of social scoring) and classifying as high-risk the most obvious AI uses by the law enforcement and judicial authorities (all of which are important, of course), the AIA remains silent on the use of AI in most administrative procedures, with the only exception of those concerning social benefits.

This approach could be generally justified by the limits to EU competence and, in particular, those derived from the principle of administrative self-organisation of the Member States. However, given the very broad approach taken by the Commission on the interpretation and use of Article 114 TFEU (which is the legal basis for the AIA, more below), this is not entirely consistent. It could rather be that the specific uses of AI by the public sector covered in the proposal reflect the increasingly well-known problematic uses of (biased) AI solutions in narrow aspects of public sector activity, rather than a broader reflection on the (still unknown, or still unimplemented) uses that could be problematic.

While the AIA is ‘future-proofed’ by including criteria for the inclusion of further use cases in its ‘high-risk’ category (which determines the bulk of compliance obligations), it is difficult to see how those criteria are suited to a significant expansion of the regulatory constraints to AI uses by the public sector, including in procurement. Therefore, as a broader point, I submit that the proposed AIA needs some revision to make it more suited to the potential deployment of AI by the public sector. To reflect on that, I am co-organising a webinar on ’Digitalization and AI decision-making in administrative law proceedings’, which will take place on 15 Nov 2021, 1pm UK (save the date, registration and more details here). All welcome.

Background on the AIA

Summarising the AIA is both difficult and has already been done (see eg this quick explainer of the Centre for Data Innovation, and for an accessible overview of the rationale and regulatory architecture of the AIA, this master class by Prof Christiane Wendehorst). So, I will just highlight here a few issues linked to the analysis of procurement’s position within its regulatory framework.

The AIA seeks to establish a proportionate approach to the regulation of AI deployment and use. While its primary concern is with the consolidation of the EU Single Digital Market and the avoidance of regulatory barriers to the circulation of AI solutions, its preamble also points to the need to ensure the effectiveness of EU values and, crucially, the fundamental rights in the Charter of Fundamental Rights of the EU.

Importantly for the purposes of our discussion, recital (28) AIA stresses that ‘The extent of the adverse impact caused by the AI system on the fundamental rights protected by the Charter is of particular relevance when classifying an AI system as high-risk. Those rights include ... right to an effective remedy and to a fair trial [Art 47 Charter] … [and] right to good administration [Art 41 Charter]’.

The AIA seeks to create such a proportionate approach to the regulation of AI by establishing four categories of AI uses: prohibited, high-risk, limited risk requiring transparency measures, and minimal risk. The two categories that carry regulatory constraints or compliance obligations are those concerning high-risk (Arts 8-15 AIA), and limited risk requiring transparency measures (Art 52 AIA, which also applies to some high-risk AI). Minimal risk AI uses are left unregulated, although the AIA (Art 69) seeks to promote the development of codes of conduct intended to foster voluntary compliance with the requirements applicable to high-risk AI systems.

Procurement within the AIA

Procurement AI practices could not be classified as prohibited uses (Art 5 AIA), except in the difficult-to-imagine circumstances in which they deployed subliminal techniques. It is also difficult to see how they could fall under the regime applicable to uses requiring special transparency (Art 52), because it only applies to AI systems intended to interact with natural persons, which must be ‘designed and developed in such a way that natural persons are informed that they are interacting with an AI system, unless this is obvious from the circumstances and the context of use.’ It would not be difficult for public buyers using external-facing AI solutions (eg chatbots seeking to guide tenderers through their e-submissions) to make it clear that the tenderers are interacting with an AI solution. And, even if not, the transparency obligations are rather minimal.

So, the crux of the issue rests on whether procurement-related AI uses could be classified as high-risk. This is regulated in Art 6 AIA, which cross-refers to Annex III AIA. The Annex contains a numerus clausus of high-risk AI uses, which is however susceptible of amendment under the conditions specified in Art 7 AIA. Art 6/Annex III do not contain any procurement-related AI uses. The only type of AI use linked to administrative procedures concerns ‘AI systems intended to be used by public authorities or on behalf of public authorities to evaluate the eligibility of natural persons for public assistance benefits and services, as well as to grant, reduce, revoke, or reclaim such benefits and services’ (Annex III(5)(a) AIA).

Clearly, then, procurement-related AI uses are currently left to the default category of those with minimal risk and, thus, subjected only to voluntary self-regulation via codes of conduct.

Could this change in the future?

Art 7 AIA establishes the following two cumulative criteria: (a) the AI systems are intended to be used in any of the areas listed in points 1 to 8 of Annex III; and (b) the AI systems pose a risk of harm to the health and safety, or a risk of adverse impact on fundamental rights, that is, in respect of its severity and probability of occurrence, equivalent to or greater than the risk of harm or of adverse impact posed by the high-risk AI systems already referred to in Annex III.

The first hurdle in getting procurement-related AI uses included in Annex III in the future is formal and concerns the interpretation of the categories listed therein. There are only two potential options: nesting them under uses related to ‘Access to and enjoyment of essential private services and public services and benefits’, or under uses related to ‘Administration of justice and democratic processes’. It could (theoretically) be possible to squeeze them into one of those categories (perhaps the latter more easily than the former), but this is by no means straightforward and, given the existing AI uses in each of the two categories, I would personally be disinclined to engage in such a broad interpretation.

Even if that hurdle were cleared, the second hurdle is also challenging. Art 7(2) AIA establishes the criteria for assessing whether an AI use poses a sufficient ‘risk of adverse impact on fundamental rights’. Of those criteria, there are several that in my view would make it very difficult to classify procurement-related AI uses as high-risk. Those criteria require the European Commission to consider:

(c) the extent to which the use of an AI system has already caused … adverse impact on the fundamental rights or has given rise to significant concerns in relation to the materialisation of such … adverse impact, as demonstrated by reports or documented allegations submitted to national competent authorities;

(d) the potential extent of such harm or such adverse impact, in particular in terms of its intensity and its ability to affect a plurality of persons;

(e) the extent to which potentially harmed or adversely impacted persons are dependent on the outcome produced with an AI system, in particular because for practical or legal reasons it is not reasonably possible to opt-out from that outcome;

(g) the extent to which the outcome produced with an AI system is easily reversible …;

Meeting these criteria would require the relevant AI systems to be making essentially independent or fully automated decisions (eg on award of contract, or exclusion of tenderers), so that their decisions would be seen to affect the effectiveness of Art 41 and 47 Charter rights, as well as a (practical) understanding that those decisions cannot be easily reversed. Otherwise, the regulatory threshold is so high that most procurement-related AI uses (screening, recommender systems, support to human decision-making (eg automated evaluation of tenders), etc) are unlikely to be considered to pose a sufficient ‘risk of adverse impact on fundamental rights’.

Could Member States go further?

As mentioned above, one of the potential explanations for the almost absolute silence on the use of AI in administrative procedures in the AIA could be that the Commission considers that this aspect of AI regulation belongs to each of the Member States. If that were true, then Member States could go further than the code of conduct self-regulatory approach resulting from the AIA regulatory architecture. An easy approach would be to eg legally mandate compliance with the AIA obligations for high-risk AI systems.

However, given the internal market justification of the AIA, to be honest, I have my doubts that such a regulatory intervention would withstand challenges on the basis of general EU internal market law.

The thrust of the competence justification for the AIA (under Art 114 TFEU, see point 2.1 of the Explanatory memorandum) is that

The primary objective of this proposal is to ensure the proper functioning of the internal market by setting harmonised rules in particular on the development, placing on the Union market and the use of products and services making use of AI technologies or provided as stand-alone AI systems. Some Member States are already considering national rules to ensure that AI is safe and is developed and used in compliance with fundamental rights obligations. This will likely lead to two main problems: i) a fragmentation of the internal market on essential elements regarding in particular the requirements for the AI products and services, their marketing, their use, the liability and the supervision by public authorities, and ii) the substantial diminishment of legal certainty for both providers and users of AI systems on how existing and new rules will apply to those systems in the Union.

All of those issues would arise if each Member State adopted its own rules constraining the use of AI for administrative procedures not covered by the AIA (whether related to procurement or not). A challenge to that decentralised approach on grounds of internal market law, by eg providers of procurement-related AI solutions capable of deployment in all Member States but burdened with uneven regulatory requirements, thus seems quite straightforward (if controversial), especially given the high level of homogeneity in public procurement regulation resulting from the 2014 Public Procurement Package. Not to mention the possibility of challenging those domestic obligations on the ground that they go further than the AIA in breach of Art 16 Charter (freedom to conduct a business), even if this could face some issues resulting from the interpretation of Art 51 thereof.

Repositioning procurement (and other aspects of administrative law) in the AIA

In my view, there is a case to be made for the repositioning of procurement-related AI uses within the AIA, and its logic can apply to other areas of administrative law/activity with similar market effects.

The key issue is that the development of AI solutions to support decision-making in the public sector not only concerns the rights of those directly involved in or affected by those decisions, but also society at large. In the case of procurement, eg the development of biased procurement evaluation or procurement recommender systems can have negative social effects via their impact on the market (eg on value for money, to mention the most obvious) that are difficult to identify in single tender procurement decisions.

Moreover, it seems that the public administration is well placed to comply with the requirements of the AIA for high-risk AI systems as a matter of routine procedure, and the arguments on the need to take a proportionate approach to the regulation of AI so as not to stifle innovation lose steam, and barely carry any punch, when it comes to imposing those requirements on public sector users. Further, to a large extent, the AIA requirements seem to me mostly aligned with the requirements for running a proper (and challenge-proof) eProcurement system, and they would also facilitate compliance with duties of good administration when specific decisions are challenged.

Therefore, on balance, I see no good reason not to expand the list in Annex III AIA to include the use of AI systems in all administrative procedures, and in particular in public procurement and in other regulatory sectors where ex post interventions to correct market distortions resulting from biased AI implementations can simply be practically impossible. I submit that this should be done before its adoption.

Can the robot procure for you? -- a short reflection a propos Chesterman (2021)


I am reading the very interesting new book by Simon Chesterman, We, the Robots? Regulating Artificial Intelligence and the Limits of the Law (Cambridge University Press, 2021). One of the thought-provoking issues the book addresses is the non-delegability of inherently governmental functions to artificial intelligence (AI). And one of the regulatory analogies used in the book concerns the limits to outsourcing as regulated by procurement law (see pages 109 ff).

The book argues that, for ‘certain decisions, it is necessary to have a human “in-the-loop” actively participating in those decisions’ (109), and states that to reach such determination, a ‘useful analogy is limits on government outsourcing to third parties’ (110).

In that regard, Chesterman leads us to consider the US approach to establishing ‘inherently governmental’ functions for the purposes of outsourcing, on which there is a very detailed and useful Office of Federal Procurement Policy (OFPP) Policy Letter 11–01, Performance of Inherently Governmental and Critical Functions.

I was curious to see whether procurement itself was considered an inherently governmental function not susceptible of outsourcing, and was glad to find this very nuanced specific treatment of the issue (I can hear my US colleagues and friends laughing at my previous ignorance).

[Image: ‘Inherently governmental’ — extract from OFPP Policy Letter 11–01 on the treatment of procurement functions]

It thus seems arguable that, in the US context and unless the decisions of an AI are somehow re-attributed to a Federal Government employee by some legal fiction, some aspects of procurement decision-making (at the contract formation phase, but not only) cannot (yet?) be delegated to an AI (or fully automated, let’s say).

Which prompts me to reflect on what the treatment would be under EU law—and under different Member States’ approaches to constraints based on public functions and the exercise of public powers. This may be the seed for a research paper — or perhaps just a follow-on blogpost — but I would be very interested in any thoughts or comments, particularly if this is an issue someone has already thought about or published on! As always, feedback and engagement most welcome at a.sanchez-graells@bristol.ac.uk.

Open Contracting: Where is the UK and What to Expect?

I had the pleasure of delivering a webinar on ‘Open Contracting Data: Where Are We & What Could We Expect?’ for the Gloucester branch of the Chartered Institute of Procurement & Supply. The webinar assessed the current state of development and implementation of open contracting data initiatives in the UK. It also considered the main principles and goals of open contracting, as well as its practical implementation and the specific challenges posed by the disclosure of business sensitive information. The webinar also mapped potential future developments and, more generally, reflected on the relevance of an adequate procurement data infrastructure for the deployment of digital technologies and, in particular, AI. The slides are available (via Dropbox) and the recording is also accessible through the image below (as well as via Dropbox).

As always, feedback most welcome: a.sanchez-graells@bristol.ac.uk.

PS. For an update on recent EBRD/EU sponsored open contracting initiatives in Greece and Poland, see here.

Challenges and Opportunities for UK Procurement During and After the Pandemic

On 30 April, I delivered a webinar on “Challenges and Opportunities for UK Procurement During and After the Pandemic” for the LUPC/SUPC Annual Conference. The slides are available via SlideShare and the recording is available via YouTube (below). Feedback most welcome: a.sanchez-graells@bristol.ac.uk.


3 priorities for policy-makers thinking of AI and machine learning for procurement governance


I find that carrying out research in the digital technologies and governance field can be overwhelming. And that is for an academic currently having the luxury of full-time research leave… so I can only imagine how much more overwhelming it must be for policy-makers thinking about the adoption of artificial intelligence (AI) and machine learning for procurement governance, to identify potential use cases and to establish viable deployment strategies.

Prioritisation seems particularly complicated, as managing such a significant change requires careful planning and paying attention to a wide variety of potential issues. However, getting prioritisation right is probably the best way of increasing the chances of success for the deployment of digital technologies for procurement governance — as well as in other areas of Regtech, such as financial supervision.

This interesting speech by James Proudman (Executive Director of UK Deposit Takers Supervision, Bank of England) on 'Managing Machines: the governance of artificial intelligence' focuses precisely on such issues. And I find the conclusions particularly enlightening:

First, the observation that the introduction of AI/ML poses significant challenges around the proper use of data, suggests that boards should attach priority to the governance of data – what data should be used; how should it be modelled and tested; and whether the outcomes derived from the data are correct.

Second, the observation that the introduction of AI/ML does not eliminate the role of human incentives in delivering good or bad outcomes, but transforms them, implies that boards should continue to focus on the oversight of human incentives and accountabilities within AI/ML-centric systems.

And third, the acceleration in the rate of introduction of AI/ML will create increased execution risks during the transition that need to be overseen. Boards should reflect on the range of skill sets and controls that are required to mitigate these risks both at senior level and throughout the organisation.

These seem to me directly transferable to the context of procurement governance and the design of strategies for the deployment of AI and machine learning, as well as other digital technologies.

First, it is necessary to create an enabling data architecture and to put significant thought into how to extract value from the increasingly available data. In that regard, there are two opportunities that should not be missed. One concerns the treatment of procurement datasets as high-value datasets for the purposes of the special regime of the Open Data Directive (for more details, see section 6 here), which will require careful consideration of the content and level of openness of procurement data in the context of the domestic transpositions that need to be in place by 17 July 2021. The other, related opportunity concerns the implementation of the new rules on eForms for procurement data publications, which Member States need to adopt by 14 November 2022. Building on the data architecture that will result from both sets of changes—which should be coordinated—will allow for the deployment of data analytics and machine learning techniques. The purposes and goals of such deployments also need to be considered carefully, as well as their potential implications.
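
To make the point about extracting value from such a data architecture a bit more concrete, the short sketch below shows the sort of very simple analytics that open procurement data enables once it is published in a structured format. It is purely illustrative and not drawn from any of the initiatives mentioned above: the field name follows the Open Contracting Data Standard (OCDS) release schema, but the file path and the assumption that each line holds one JSON release are hypothetical.

```python
# Illustrative sketch only: computing a single-bid rate from open contracting data.
# Assumes a (hypothetical) newline-delimited JSON file of OCDS releases.
import json

def single_bid_rate(path: str) -> float:
    """Share of tenders that attracted only one tenderer."""
    total, single = 0, 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            release = json.loads(line)
            # 'tender.numberOfTenderers' is an OCDS field; releases lacking it are skipped.
            n = release.get("tender", {}).get("numberOfTenderers")
            if n is None:
                continue
            total += 1
            if n == 1:
                single += 1
    return single / total if total else 0.0

if __name__ == "__main__":
    print(f"Single-bid rate: {single_bid_rate('ocds_releases.jsonl'):.1%}")
```

Even an indicator as crude as this one depends entirely on the coverage and quality of the underlying data, which is precisely why the data architecture needs to come first.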

Second, it seems clear that the changes in the management of procurement data and the quick development of analytics that can support procurement decision-making pile additional training and upskilling needs on top of the existing (and partially unaddressed?) challenges of fully consolidating eProcurement across the EU. Moreover, it should be clear that there is no such thing as an objective and value-neutral implementation of technological governance solutions, and that all levels of accountability need to be provided with adequate data skills and digital literacy upgrades in order to check what is being done at the technical level (for crystal-clear discussion, see van der Voort et al, 'Rationality and politics of algorithms. Will the promise of big data survive the dynamics of public decision making?' (2019) 36(1) Government Information Quarterly 27-38). Otherwise, governance mechanisms would be at risk of failure due to techno-capture and/or techno-blindness, whether intended or accidental.

Third, there is an increasing need to manage change and the risks that come with it. In a notoriously risk averse policy field such as procurement, this is no minor challenge. This should also prompt some rethinking of the way the procurement function is organised and its risk-management mechanisms.

Addressing these priorities will not be easy or cheap, but these are the fundamental building blocks required to enable the public procurement sector to reap the benefits of digital technologies as they mature. In consultancy jargon, these are the priorities needed to ‘future-proof’ procurement strategies. Will they be adopted?

Postscript

It is worth adding that the first and second issues, in particular, lend themselves to strong collaborations between policy-makers and academics. As rightly pointed out by Pencheva et al, 'Big Data and AI – A transformational shift for government: So, what next for research?' (2018) Public Policy and Administration, advance access at 16:

... governments should also support the efforts for knowledge creation and analysis by opening up their data further, collaborating with – and actively seeking inputs from – researchers to understand how Big Data can be utilised in the public sector. Ultimately, the supporting field of academic thought will only be as strong as the public administration practice allows it to be.

'Experimental' WEF/UK Guidelines for AI Procurement: some comments

ⓒ Scott Richard, Liquid painting (2015).


On 20 September 2019, and as part of its ‘Unlocking Public Sector Artificial Intelligence’ project, the World Economic Forum (WEF) published the White Paper Guidelines for AI Procurement (see also press release), with which it seeks to help governments accelerate efficiencies through responsible use of artificial intelligence and prepare for future risks. WEF indicated that over the next six months, governments around the world will test and pilot these guidelines (for now, there are indications of adoption in the UK, the United Arab Emirates and Colombia), and that further iterations will be published based on feedback learned on the ground.

Building on previous work on the Data Ethics Framework and the Guide to using AI in the Public Sector, the UK’s Office for Artificial Intelligence has decided to adopt its own draft version of the Guidelines for AI Procurement with substantially the same content, but with modified language and a narrower scope of some principles, in order to link them to the UK’s legislative and regulatory framework (and, in particular, the Data Ethics Framework). The UK will be the first country to trial the guidelines in pilot projects across several departments. The UK Government hopes that the new Guidelines for AI Procurement will help inform and empower buyers in the public sector, helping them to evaluate suppliers, then confidently and responsibly procure AI technologies for the benefit of citizens.

In this post, I offer some first thoughts about the Guidelines for AI Procurement, based on the WEF’s version, which is helpfully summarised in the table below.

Source: WEF, White Paper: ‘Guidelines for AI Procurement’ at 6.


Some Comments

Generally, it is worth being mindful that the ‘guidelines provide fundamental considerations that a government should address before acquiring and deploying AI solutions and services. They apply once it has been determined that the solution needed for a problem could be AI-driven’ (emphasis in original). As the UK’s version usefully stresses, many of the important decisions take place at the preparation and planning stages, before publishing a contract notice. Therefore, more than guidance for AI procurement, this is guidance on the design of a framework for the governance of innovative digital technologies procurement, including AI (but easily extendable to eg blockchain-based solutions), which will still require a second tier of (future/additional) guidance on the implementation of procurement procedures for the acquisition of AI-based solutions.

It is also worth stressing from the outset that the guidelines assume both the availability, and a deep understanding by the contracting authority, of the data that can be used to train and deploy the AI solutions, which is perhaps not fully reflective of the existing difficulties concerning the availability and quality of procurement data, and public sector data more generally [for discussion, see A Sanchez-Graells, 'Data-Driven and Digital Procurement Governance: Revisiting Two Well-Known Elephant Tales' (2019) Communications Law, forthcoming]. Where such knowledge is not readily available, it seems likely that the contracting authority may require the prior engagement of data consultants to assess the data that is or could be available and its potential uses. This creates the need to roll back some of the considerations included in the guidelines to that earlier stage, much along the lines of the issues concerning preliminary market consultations and the neutralisation of any advantages or conflicts of interest of undertakings involved in pre-tender discussions, which are also common issues in non-AI procurement of innovation. This can be rather tricky, in particular if there is a significant imbalance in expertise around data science and/or a shortfall in those skills in the contracting authority. Therefore, perhaps as a prior recommendation (or an expansion of guideline 7), it may be worth bearing in mind that the public sector needs to invest significant resources in hiring and retaining the necessary in-house capacities before engaging in the acquisition of complex (digital) technologies.

1. Use procurement processes that focus not on prescribing a specific solution, but rather on outlining problems and opportunities and allow room for iteration.

The fit of this recommendation with the existing regulation of procurement procedures seems to point towards either innovation partnerships (for new solutions) or dynamic purchasing systems (for existing or relatively off-the-shelf solutions). The reference to dynamic purchasing systems is slightly odd here, as solutions are unlikely to be susceptible of automatic deployment in any given context.

Moreover, this may not necessarily be the only possible approach under EU law and there seems to be significant scope to channel technology contests under the rules for design contests (Arts 78 and ff of Directive 2014/24/EU). The limited appetite of innovative start-ups for procurement systems that do not provide them with ‘market exposure’ (such as large framework agreements, but likely also dynamic purchasing systems) may be relevant, depending on market conditions (see eg PUBLIC, Buying into the Future. How to Deliver Innovation through Public Procurement (2019) 23). This could create opportunities for broader calls for technological innovation, perhaps as a phase prior to conducting a more structured (and expensive) procurement procedure for an innovation partnership.

All in all, it would seem advisable—at least at UK level, or in any other jurisdictions seeking to pilot the guidance—to design a standard procurement procedure for AI-related market engagement, in order to avoid each willing contracting authority having to reinvent the wheel.

2. Define the public benefit of using AI while assessing risks.

Like with many other aspects of the guidelines, one of the difficulties here is to try to establish actionable measures to deal with ‘unknown unknowns’ that may emerge only in the implementation phase, or well into the deployment of the solution. It would be naive to assume that the contracting authority—or the potential tenderers—can anticipate all possible risks and design adequate mitigating strategies. It would thus perhaps be wise to recommend the use of AI solutions for public sector / public service use cases that have a limited impact on individual rights, as a way to gain much-needed expertise and know-how before proceeding to deployment in more sensitive areas.

Moreover, this is perhaps the recommendation that is most difficult to operationalise in procurement terms (under the EU rules), as the consideration of ‘public benefit’ seems to be a matter for the contracting authority’s sole assessment, which could eventually lead to a cancellation—with or without retendering—of the procurement. It is difficult to see how to design evaluation tools (in terms of both technical specifications and award criteria) capable of capturing the insight that ‘public benefit extends beyond value for money and also includes considerations about transparency of the decision-making process and other factors that are included in these guidelines’. This should thus likely be built into the procurement process through opportunities for the contracting authority to discontinue the project (with no or limited compensation), which also points towards the structure of the innovation partnership as the regulated procedure most likely to fit.

3. Aim to include your procurement within a strategy for AI adoption across government and learn from others.

This is mainly aimed at ensuring the cross-sharing of experiences and at concentrating the need for specific AI-based solutions, which makes sense. The difficulty will be in the practical implementation of this in a quickly-changing setting, which could be facilitated by the creation of a mandatory (not necessarily public) centralised register of AI-based projects, as well as by the creation and mandatory involvement of a specialised administrative unit. This would be linked to the general comment on the need to invest in skills, but could alleviate the financial impact by making the resources available across Government rather than having each contracting authority create its own expert team.

4. Ensure that legislation and codes of practice are incorporated in the RFP.

Both aspects of this guideline are problematic in a lawyer’s eyes. It is not a matter of legal imperialism simply to consider that there have to be more general mechanisms to ensure that procurement procedures (not only for digital technologies) are fully legally compliant.

The recommendation to carry out a comprehensive review of the legal system to identify all applicable rules and then ‘Incorporate those rules and norms into the RFP by referring to the originating laws and regulations’ does not make a lot of sense, since their inclusion (or not) in the RFP does not affect the enforceability of those rules, and given the practical impossibility for a contracting authority to assess the entirety of the rules applicable to different tenderers, in particular if they are based in other jurisdictions. It would also create all sorts of problems in terms of potential claims of legitimate expectations by tenderers. Moreover, under EU law, there is case law (such as Pizzo and Connexxion Taxi Services) that creates conflicting incentives for the inclusion of specific references to rules and their interpretation in tender documents.

The recommendation on balancing trade secret protection and the public interest, including data privacy compliance, is simply insufficient and falls well short of the challenge of addressing these complex issues. The tension between general duties of administrative law and the opacity of algorithms (in particular where they are protected by IP or trade secret protections) is one of the most heated ongoing debates in legal and governance scholarship. The recommendation also overlooks the need to distinguish between the different rules applicable to the data and to the algorithms, as well as the paramount relevance of the General Data Protection Regulation in this context (at least where EU data is concerned).

5. Articulate the technical feasibility and governance considerations of obtaining relevant data.

This is, in my view, the strongest part of the guidelines. The stress on the need to ensure access to data as a pre-requisite for any AI project and the emphasis and detail put in the design of the relevant data governance structure ahead of the procurement could not be clearer. The difficulty, however, will be in getting most contracting authorities to this level of data-readiness. As mentioned above, the guidelines assume a level of competence that seems too advanced for most contracting authorities potentially interested in carrying out AI-based projects, or that could benefit from them.
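
By way of illustration of what such data-readiness could involve in practice, the sketch below runs the kind of rough checks a contracting authority (or its data consultants) might perform before committing to an AI project. It is my own minimal example, not part of the WEF or UK guidelines, and the column names and required-field list are hypothetical.

```python
# Illustrative sketch only: a rough data-readiness check ahead of an AI procurement.
# Column names and the required-field list are hypothetical examples.
import pandas as pd

REQUIRED_COLUMNS = ["tender_id", "publication_date", "award_value", "supplier_id"]

def readiness_report(df: pd.DataFrame) -> dict:
    report = {}
    # Coverage: are the fields the project would rely on actually captured?
    report["missing_columns"] = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    # Completeness: share of missing values in the fields that are present.
    present = [c for c in REQUIRED_COLUMNS if c in df.columns]
    report["null_share"] = df[present].isna().mean().round(3).to_dict()
    # Volume: is there enough history to train and validate anything at all?
    report["rows"] = len(df)
    return report

# Example usage with a hypothetical extract of past tender data:
# past_tenders = pd.read_csv("past_tenders.csv")
# print(readiness_report(past_tenders))
```

Checks of this kind will often reveal that the data governance work envisaged in guideline 5 has to happen before, not alongside, the procurement of the AI solution itself.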

6. Highlight the technical and ethical limitations of using the data to avoid issues such as bias.

This guideline is also premised on advanced knowledge and understanding of the data by the contracting authority, and thus creates the same challenges (as further discussed below).

7. Work with a diverse, multidisciplinary team.

Once again, this will be expensive and create some organisational challenges (as also discussed below).

8. Focus throughout the procurement process on mechanisms of accountability and transparency norms.

This is another rather naive and limited aspect of the guidelines, in particular the final point that ‘If an algorithm will be making decisions that affect people’s rights and public benefits, describe how the administrative process would preserve due process by enabling the contestability of automated decision-making in those circumstances.' This is another of the hotly-debated issues surrounding the deployment of AI in the public sector and it seems unlikely that a contracting authority will be able to provide the necessary answers to issues that are yet to be determined—eg the difficult interpretive issues surrounding solely automated processing of personal data under the General Data Protection Regulation, as discussed in eg M Finck, ‘Automated Decision-Making and Administrative Law’ (2019) Max Planck Institute for Innovation and Competition Research Paper No. 19-10.

9. Implement a process for the continued engagement of the AI provider with the acquiring entity for knowledge transfer and long-term risk assessment.

This is another area of general strength in the guidelines, which under EU procurement law should be channelled through stringent contract performance conditions (Art 70 Directive 2014/24/EU) or, perhaps even better, by creating secondary regulation on mandatory ongoing support and knowledge transfer for all AI-based implementations in the public sector.

The only aspect of this guideline that is problematic concerns the mention that, in relation to ethical considerations, ‘Bidders should be able not only to describe their approach to the above, but also to provide examples of projects, complete with client references, where these considerations have been followed.’ This would clearly be a problem for new entrants, as well as generate rather significant first-mover advantages for undertakings with prior experience (likely in the private sector). In my view, this should be removed from the guidelines.

10. Create the conditions for a level and fair playing field among AI solution providers.

This section raises significant challenges concerning the ownership of IP in AI-based solutions. Most of the recommendations seem rather complicated to implement in practice, such as the reference to the need to ‘Consider strategies to avoid vendor lock-in, particularly in relation to black-box algorithms. These practices could involve the use of open standards, royalty-free licensing and public domain publication terms’, or to ‘consider whether [the] department should own that IP and how it would control it [in particular in the context of evolution or new design of the algorithms]. The arrangements should be mutually beneficial and fair, and require royalty-free licensing when adopting a system that includes IP controlled by a vendor’. These are also extremely complex and debated issues and, once again, it seems unlikely that a contracting authority will be able to provide all relevant answers.

Overall assessment

The main strength of the guidelines lies in their recommendations concerning the evaluation of data availability and quality, the need to create robust data governance frameworks, and the need to have a deep insight into data limitations and biases (guidelines 5 and 6). There are also some useful, although rather self-explanatory, reminders of basic planning issues concerning the need to ensure the relevant skillset and the unavoidable multidisciplinarity of teams working on AI (guidelines 3 and 7). Similarly, the guidelines provide some very high-level indications on how to structure the procurement process (guidelines 1, 2 and 9), which will however require much more detailed (future/additional) guidance before they can be implemented by a contracting authority.

However, in all other aspects, the guidelines work as an issue-spotting instrument rather than as a guidance tool. This is clearly the case concerning the tensions between data privacy, good administration and the proprietary protection of the IP and trade secrets underlying AI-based solutions (guidelines 4, 8 and 10). In my view, rather than taking the naive—and potentially misleading—approach of indicating the issues that contracting authorities need to address (in the RFP, or elsewhere) as if they were currently (easily, or at all) addressable at that level of administrative practice, the guidelines should provide sufficiently precise and goal-oriented recommendations on how to do so if they are to be useful. This is not an easy task, and much more work seems necessary before the document can provide useful support to contracting authorities seeking to implement procedures for the procurement of AI-based solutions. I thus wonder how much learning the guidelines can generate in the pilots to be conducted in the UK and elsewhere. For now, I would recommend that other governments wait and see before ‘adopting’ the guidelines or treating them as a useful policy tool, in particular if doing so discouraged them from carrying out their own efforts to develop actionable guidance on how to procure AI-based solutions.

Finally, it does not take much reading between the lines to realise that the challenges of developing an enabling data architecture and of upskilling the public sector (not solely the procurement workforce, and perhaps through specialised units, as a first step) so that it is able to identify the potential for AI-based solutions and to adequately govern their design and implementation remain very likely stumbling blocks on the road towards the deployment of public sector AI. In that regard, general initiatives concerning the availability of quality procurement data and the necessary reform of public procurement teams to fill the data science and programming gaps that currently exist should remain the priority—at least in the EU, as discussed in A Sanchez-Graells, EU Public Procurement Policy and the Fourth Industrial Revolution: Pushing and Pulling as One? (2019) SSRN working paper, and in idem, 'Some public procurement challenges in supporting and delivering smart urban mobility: procurement data, discretion and expertise', in M Finck, M Lamping, V Moscon & H Richter (eds), Smart Urban Mobility – Law, Regulation, and Policy, MPI Studies on Intellectual Property and Competition Law (Berlin, Springer, 2020) forthcoming.