The Road to AutoDeclare.
In 2022, advances in natural language processing (NLP) brought large language models (LLMs) into mainstream use. Several LLMs are available from major developers such as Google and OpenAI, e.g. OpenAI's ChatGPT.
With these models it is possible to generate everything from simple essays to code, e.g. OpenAI's Codex (which powers Microsoft's GitHub 'Copilot'), 'Tabnine', Google's 'T5', Carnegie Mellon University's 'PolyCoder' and 'Cogram'.
GPTs offer significant potential to improve the trustworthiness of AI.
For example, researchers (e.g. Constitutional.ai) are now working on GPTs that can take an organisation's value statements, such as those of CarefulAI:
- 'AI that is validated for people, by people, will be trusted and therefore have an effect more quickly'
- 'Users should set targets for accuracy, sensitivity and specificity, these should take priority over academic and industrial priorities.'
- 'AI needs to be trusted, so the motives behind its design, and its fitness for purpose should be made transparent, and privacy protected.'
- 'AI is only as good as it can be proven to be at any point in time'
and use them to build AI systems that embody and work to these values.
But whatever comes out of these approaches, one thing will remain true: humans and regulators will still want to validate such systems' compliance with best practice in trustworthy AI.
As a consequence, established AI trustworthiness frameworks like PRIDAR from CarefulAI, and new ones like BSI30440, ALTAI, and Plot 4ai, along with guidance from governments and regulators, will still be used in AI compliance auditing.
These place a significant burden on innovation: auditing against standards and frameworks is led by human subject matter experts, and there are not enough of them to meet demand, even though such auditing is a $2.8 billion industry. The backlog to start auditing against frameworks has been quoted at two years in some industries, e.g. the software as a medical device sector.
But there is some good news...
As a leader in the validation of AI trustworthiness, CarefulAI discovered that between 40% and 48% of the auditable evidence of an AI system's trustworthiness exists in an AI supplier's codebase and system design.
Normally one needs to be an expert in ML/AI to understand this evidence, because it is not well described (commented on) in the codebase. If it were, and defined meta labels associated with trustworthiness were added (e.g. 'bias test'), such code could be read automatically and associated with declarations of trustworthiness. Models of trustworthiness could also be built for different sectors, based on the frequency of meta labels in sector code.
This could speed up the process of declaring conformance to trustworthiness frameworks by 40%.
It was from this realisation that the idea of AutoDeclare was born:
The automatic annotation of AI code and systems with trustworthiness 'meta labels' that can be audited.
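By way of illustration, the sketch below shows what such meta labelling and automatic reading might look like in practice. It is a minimal example only: the '@trust:' comment format, the label names and the repository path are assumptions made for this sketch, not AutoDeclare's published scheme.

```python
"""Minimal illustrative sketch of meta-label scanning (assumed conventions)."""
import re
from collections import Counter
from pathlib import Path

# Assumed convention: developers tag trustworthiness-relevant code with
# comments such as:
#   # @trust: bias-test
#   # @trust: data-provenance
LABEL_PATTERN = re.compile(r"#\s*@trust:\s*([\w-]+)")


def extract_meta_labels(source: str) -> list[str]:
    """Return every trustworthiness meta label found in a source string."""
    return LABEL_PATTERN.findall(source)


def scan_codebase(root: str) -> Counter:
    """Walk a codebase and count how often each meta label appears."""
    counts = Counter()
    for path in Path(root).rglob("*.py"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        counts.update(extract_meta_labels(text))
    return counts


if __name__ == "__main__":
    # Tally meta labels across a (hypothetical) supplier repository, producing
    # a frequency profile that could be associated with declarations of
    # trustworthiness or compared against a sector baseline.
    profile = scan_codebase("path/to/supplier_repo")
    for label, count in profile.most_common():
        print(f"{label}: {count}")
```

A frequency profile like this would only be raw material: mapping labels to the declarations required by a given framework would still be subject to human and regulatory audit.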
If you are interested in working with, or observing, CarefulAI, researchers, regulators, developers and AI platform leads as they work to bring the principles of AutoDeclare to market:
Get in Touch