-
5. Reliability and safety
5.1 Data suitability
The data used to operate, train and validate your AI system has a significant impact on its performance, fairness and safety. In your answer to this question, explain why the chosen data is suitable for your use case. Some relevant considerations are outlined below.
When choosing between datasets, consider whether the data can be disaggregated by marginalised groups, particularly by Indigeneity. If the data is Indigenous data, you should refer to the guidelines in the Framework for Governance of Indigenous Data (see section 5.2 below).
Data quality should be assessed prior to use in AI systems. Agencies should select applicable metrics to determine a dataset’s quality and identify any remediation required before using it for training or validation in AI systems. Relevant metrics to consider include relevance, accuracy, completeness, timeliness, validity and lack of duplication. One method to help ensure good quality data is to set minimum thresholds appropriate to specific use cases, such as through the acceptance criteria discussed below at 5.4. An example of a specific framework for determining data quality in statistical uses is the ABS Data Quality Framework.
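As an illustration only, the sketch below computes two of these metrics (completeness and duplication) for a tabular dataset and compares them against minimum thresholds. The file name, thresholds and pass/fail logic are hypothetical and would need to be set for your specific use case and acceptance criteria.

```python
import pandas as pd

# Hypothetical minimum quality thresholds for a specific use case.
THRESHOLDS = {
    "completeness": 0.98,  # minimum share of non-missing values in the worst column
    "duplication": 0.01,   # maximum share of duplicate rows
}

def assess_quality(df: pd.DataFrame) -> dict:
    """Compute two simple data quality metrics for a tabular dataset."""
    return {
        "completeness": float(df.notna().mean().min()),  # worst column completeness
        "duplication": float(df.duplicated().mean()),    # share of duplicate rows
    }

def meets_thresholds(metrics: dict) -> bool:
    """Check the computed metrics against the agreed minimum thresholds."""
    return (
        metrics["completeness"] >= THRESHOLDS["completeness"]
        and metrics["duplication"] <= THRESHOLDS["duplication"]
    )

if __name__ == "__main__":
    data = pd.read_csv("training_data.csv")  # hypothetical training dataset
    metrics = assess_quality(data)
    print(metrics, "PASS" if meets_thresholds(metrics) else "REMEDIATE BEFORE USE")
```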
Where third party material or data is being used to operate, train or validate an AI system, agencies should assess the data and the AI system for copyright concerns due to the potential for copying or transforming material that is protected by copyright or broader intellectual property laws.
You should also consider:
Data provenance
Involves creating an audit trail to assign custody and trace accountability for issues. It provides assurance of the chain of custody and its reliability, insofar as origins of the data are documented.
Data lineage
Involves documenting data origins and flows to enable stakeholders to better understand how datasets are constructed and processed. This fosters transparency and trust in AI systems.
Data volume
Consider the volume of data you need to support the operation, training and validation of your AI system.
5.2 Indigenous data
Describe how any components of your AI system have used or will use Indigenous data, or where any outputs relate to Indigenous individuals, communities or groups.
All Australian Public Service (APS) agencies are required to implement the Framework for Governance of Indigenous Data (GID). The GID adopts the definition of ‘Indigenous data’ as provided by Maiam nayri Wingara Indigenous Data Sovereignty Collective:
Information or knowledge, in any format or medium, which is about and may affect Indigenous peoples both collectively and individually.
If the data used to operate, train or validate your AI system, or any outputs from your AI system, are Indigenous data in line with the Maiam nayri Wingara definition above, you should refer to the guidelines in the GID.
This includes applying the principles of respect for cultural heritage, informed consent, privacy (including collective or group privacy) and trust, to all stages of the ‘Data Lifecycle’. These concepts, including the FAIR (Findable, Accessible, Interoperable, and Reusable) and CARE (Collective Benefit, Authority to Control, Responsibility, Ethics) principles, are described in the GID.
Relevant practices to consider in this context include:
- Checking if datasets used to train the AI included diverse and representative samples of cultural expression, artifacts, languages and practices. This supports the AI system being able to recognise and appropriately respond to a greater range of cultural contexts in a less biased manner.
- Describing any mechanisms in place for engaging with Indigenous individuals, communities or group representatives and collecting and incorporating their feedback on the AI system’s performance, especially regarding cultural aspects.
- Describing processes to review documentation and protocols that ensure the project has incorporated the GID principles. Look for evidence of meaningful engagement with and input from suitably qualified and experienced Indigenous individuals, communities and groups. Assess if the system includes features or options that allow Indigenous stakeholders to control how their data is used and represented and describe how benefits of the project to First Nations Peoples, to which the data relate, have been considered.
Also consider the use of Indigenous data in the context of the United Nations Declaration on the Rights of Indigenous Peoples and apply the concept of ‘free, prior and informed consent’ in relation to the use of Indigenous data in AI systems.
5.3 Suitability of procured AI model
If you are procuring an AI model (or system) from a third‑party provider, your procurement process should consider whether the provider has appropriate data management (including data quality and data provenance), governance, data sourcing, privacy, security, intellectual property, and cybersecurity practices in relation to the model. This will help you to identify whether the AI model is fit for the context and purpose of your AI use case.
The data used to train an AI model shapes its outputs and may not be relevant to your use case or the Australian context. Consider whether the model is likely to make accurate or reliable predictions about Australian subject matter if it has been trained on, for example, US‑centric data.
In addition, there are a number of other considerations you should take into account when selecting a procured AI model. The following considerations may be relevant to your use case.
- Does the AI model meet the functional requirements needed for your use case?
- How was the model evaluated? What test data and benchmarks were used?
- How is versioning for the AI model handled?
- What support does the provider offer to users/procurers?
- What provisions apply regarding potential liability issues? If the product fails, is accountability clear between your agency and the provider?
- What security precautions have been taken? What residual risks remain and how are these being mitigated?
- Are there any guarantees that data handling and management (for the entire lifecycle of the data) for the procured model meet internal agency and legislative requirements? What guarantees are there regarding the robustness of the model?
- What measures have been taken to prevent or reduce hallucinations, unwanted bias and model drift?
- Is the explainability and interpretability of the model sufficient for your use case?
- What computing and storage capacities are necessary for operating the model on‑premises?
- What capability is needed to maintain the AI model? Can this be done in‑house, or will this need to be sourced externally?
- If you are considering using a platform as a service (PaaS) to run and support your AI system or AI model, have you considered risks associated with outsourcing?
Consider also how your agency will support transparency across the AI supply chain, for example, by notifying the developer of issues encountered in using the model or system.
5.4 Testing
Testing is a key element for assuring the responsible and safe use of AI models – for both models developed in-house and externally procured – and in turn, of AI systems. Rigorous testing helps validate that the system performs as intended across diverse scenarios. Thorough and effective testing helps identify problems before deployment.
Testing AI systems against test datasets can reveal biases or possible unintended consequences or issues before real-world deployment. Testing on data that is limited or skewed can fail to reveal shortcomings.
Consider establishing clear and measurable acceptance criteria for the AI system that, if met, would be expected to control harms that are relevant in the context of your AI use case. Acceptance criteria should be specific, objective and verifiable. They are meant to specify the conditions under which a potential harm is adequately controlled.
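As a hedged illustration, acceptance criteria of this kind can be expressed as automated checks that run against a held-out test set. The sketch below uses pytest-style tests with hypothetical thresholds; `model`, `X_test`, `y_test` and `group_labels` are assumed to be supplied (for example, as test fixtures), and the metric names and values shown are placeholders, not recommended targets.

```python
# Hypothetical acceptance criteria expressed as automated (pytest-style) checks.
# The metric names and thresholds are placeholders, not recommended targets.
from sklearn.metrics import accuracy_score, recall_score

ACCEPTANCE_CRITERIA = {
    "min_accuracy": 0.95,          # overall accuracy on the held-out test set
    "min_recall_per_group": 0.90,  # recall for each cohort in the test data
}

def test_overall_accuracy(model, X_test, y_test):
    predictions = model.predict(X_test)
    assert accuracy_score(y_test, predictions) >= ACCEPTANCE_CRITERIA["min_accuracy"]

def test_recall_for_each_group(model, X_test, y_test, group_labels):
    predictions = model.predict(X_test)
    for group in set(group_labels):
        mask = [label == group for label in group_labels]
        recall = recall_score(
            [y for y, keep in zip(y_test, mask) if keep],
            [p for p, keep in zip(predictions, mask) if keep],
        )
        assert recall >= ACCEPTANCE_CRITERIA["min_recall_per_group"], group
```

Documenting the results of checks like these in the test report supports accountability and makes it clear whether each acceptance criterion was met before deployment.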
Consider developing a test plan for the acceptance criteria to outline the proposed testing methods, tools and metrics. Documenting results through a test report will assist with demonstrating accountability and transparency. A test report could include the following:
- a summary of the testing objectives, methods and metrics used
- results for each test case
- an analysis of the root causes of any identified issues or failures
- recommendations for remediation or improvement, and whether the improvements should be done before deployment or as a future release.
In your explanation, outline any areas of concern in results from testing. If you have not started testing, outline elements to be considered in testing plans.
Model accuracy
As an example, model accuracy is a key metric for evaluating the performance of an AI system. Accuracy should be considered in the specific context of the AI use case, as the consequences of errors or inaccuracies can vary significantly depending on the domain and application.
Some of the factors that can influence AI model output accuracy and reliability include:
- choice of AI model or model architecture
- quality, accuracy and representativeness of training data
- presence of bias in the training data or AI model
- robustness to noise, outliers and edge cases
- ability of the AI model to generalise to new data
- potential for errors or ‘hallucinations’ in outputs
- environmental factors (such as lighting conditions for computer vision systems)
- adversarial attacks (such as malicious actors manipulating input data to affect outputs)
- stability and consistency of performance over time.
Ways to assess and validate the accuracy of your model for your AI use case include the following (a brief illustrative sketch is provided after this list):
- quantitative metrics
- qualitative analysis (for example, manual review of output, error analysis, user feedback)
- domain-specific benchmarks or performance standards
- comparison to human performance or alternative models.
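For instance, quantitative metrics can be computed on a held-out test set. The sketch below is illustrative only and assumes a fitted scikit-learn style classifier `clf` and test data `X_test` and `y_test`; the appropriate metrics depend on your use case and risk context.

```python
# Illustrative quantitative evaluation for a classification use case, assuming a
# fitted scikit-learn style classifier `clf` and a held-out test set.
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

def evaluate(clf, X_test, y_test):
    predictions = clf.predict(X_test)
    print("Accuracy:", accuracy_score(y_test, predictions))
    print("Confusion matrix:\n", confusion_matrix(y_test, predictions))
    # Per-class precision, recall and F1 support qualitative error analysis.
    print(classification_report(y_test, predictions))
```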
It is important to set accuracy targets that are appropriate for the risk and context of the use case. For high-stakes decisions, you should aim for a very high level of accuracy and have clear processes for handling uncertain or borderline cases.
5.5 Pilot
Conducting a pilot study is a valuable way to assess the real-world performance and impact of your AI use before full deployment. A well-designed pilot can surface issues related to reliability, safety, fairness and usability that may not be apparent in a controlled development environment.
If you are planning a pilot, your explanation should provide a brief overview of the pilot's:
- scope and duration
- objectives and key results (OKRs)
- key performance indicators (KPIs)
- participant selection and consent process
- risk mitigation strategies.
If you have already completed a pilot, reflect on the key findings and lessons learned. How did the pilot outcomes compare to your expectations? What issues or surprises emerged? How did you adapt your AI use case based on the pilot results?
If you are not planning to conduct a pilot, explain why not. Consider whether the scale, risk or novelty of your use case warrants a pilot phase. Discuss alternative approaches you are taking to validate the performance of your AI use case and gather user feedback prior to full deployment.
5.6 Monitoring
Monitoring is key to maintaining the reliability and safety of AI systems over time. It enables active rather than passive oversight and governance.
Your monitoring plan should be tailored to the specific risks and requirements of your use case. In your explanation, describe your approach to monitoring any measurable acceptance criteria (as discussed above at 5.4) as well as other relevant metrics such as performance metrics or anomaly detection. In your plan, you should include your proposed monitoring intervals for your use case. Consider including procedures for reporting and learning from incidents. You may wish to refer to the OECD paper on Defining AI incidents and related terms.
Periodically evaluate your monitoring and evaluation mechanisms to ensure they remain effective and aligned with evolving conditions throughout the lifecycle of your AI use case. Examples of events that could influence your monitoring plan are system upgrades, error reports, changes in input data, performance deviation or feedback from stakeholders.
Monitoring can help identify issues that can impact the safety and reliability of your AI system, such as concept or data drift.
- Concept drift refers to a change in the relationship between the input data and the feature being predicted.
- Data drift refers to a change in input data patterns compared to the data used to train the model (a simple data drift check is sketched below).
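As a minimal illustration of the data drift check referred to above, the sketch below compares the distribution of numeric features in recent production inputs against the training data using a two-sample Kolmogorov–Smirnov test. The monitored features and significance threshold are hypothetical and should be chosen for your use case and monitoring intervals.

```python
# Minimal data drift check: compare recent production inputs against training
# data feature by feature using a two-sample Kolmogorov-Smirnov test.
# The monitored features and significance threshold are hypothetical.
import pandas as pd
from scipy.stats import ks_2samp

DRIFT_P_VALUE = 0.01  # below this, flag the feature as having drifted

def check_data_drift(train: pd.DataFrame, recent: pd.DataFrame, features: list) -> list:
    drifted = []
    for feature in features:
        statistic, p_value = ks_2samp(train[feature].dropna(), recent[feature].dropna())
        if p_value < DRIFT_P_VALUE:
            drifted.append({"feature": feature, "statistic": float(statistic)})
    return drifted  # feed results into monitoring alerts and incident reporting
```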
Vendors offer monitoring tools that may be worth considering for your use case. For more information, see pp. 26-27 of the NAIC’s Implementing Australia’s AI Ethics Principles report.
5.7 Preparedness to intervene or disengage
Relevant stakeholders, including those who operate, use or interact with the AI system, those who monitor AI system performance, and affected stakeholders identified at section 2.4, should have the ability to raise concerns about insights or decisions informed by the AI system.
Agencies should develop clear escalation processes for raising concerns, such as designated points of contact, guidelines and criteria for when human intervention is necessary and timelines for response and resolution. Agencies should also consider documenting and reviewing any interventions that occur to ensure consistency and fairness.
In addition, agencies should be prepared to quickly and safely disengage an AI system when an unresolvable issue is identified. This could include a data breach, unauthorised access or system compromise. Consider such scenarios in business continuity, data breach and security response plans.
Agencies should consider the techniques below to avoid overreliance on AI system outputs.
System design stage
Build in transparency about system limitations
Incorporate prompts to remind users to critically analyse outputs, such as explanations of outputs, hallucination reminders, and accuracy scores.
Build in 2-way feedback pathways
Prompt users to assess the quality of the AI system’s outputs and provide feedback.
Similarly, provide feedback to users on their interactions with the systems (e.g. feedback on ineffective prompts, alerts when the user has accepted a risky decision).
Prompt human decisions
Consider designing your AI system to provide options for the user to choose from, rather than a single solution, to encourage user engagement with AI outputs.
Evaluation stage
Ensure regular evaluation
Involve users in regular evaluations of your AI system. Encourage users to assess the effectiveness of the AI system and identify areas for improvement.
-
6. Privacy protection and security
6.1 Minimise and protect personal information
Data minimisation
Data minimisation is an important consideration when developing and deploying AI systems for several reasons, including protecting privacy and improving data quality and model stability. In some cases, more data may be warranted (for example, for some large language models), but it is important that you follow good practice in determining the data needed for your use case.
Privacy requirements for personal information under the Australian Privacy Principles (APPs) are an important consideration in responding to this question. Ensure you have considered your obligations under the APPs, particularly APPs 3, 6 and 11.
For more information, you should consult the APP guidelines, your agency’s internal privacy policy and resources and privacy officer.
Privacy enhancing technologies
Your agency may want or need to use privacy enhancing technologies to assist in de‑identifying personal information under the APPs or as a risk mitigation/trust building approach. Under the Privacy Act 1988 (Cth) and the APPs, where information has been appropriately de‑identified it is no longer personal information and can be used in ways that the Privacy Act would normally restrict.
The Office of the Australian Information Commissioner’s (OAIC) website provides detailed guidance on De-identification and the Privacy Act that agencies should consider. You may also wish to refer to the De-identification Decision-Making Framework, jointly developed by the OAIC and CSIRO Data61.
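As a minimal sketch only, and not a substitute for the OAIC guidance or the De-identification Decision-Making Framework, the example below shows one common privacy-enhancing step: replacing a direct identifier with a salted hash and dropping other direct identifiers. The column names and salt handling are hypothetical, and re-identification risk would still need to be assessed across the whole dataset.

```python
import hashlib
import pandas as pd

SALT = "retrieve-from-a-secret-store-not-source-code"  # hypothetical secret salt

def pseudonymise(df: pd.DataFrame) -> pd.DataFrame:
    """Replace a direct identifier with a salted hash and drop other identifiers."""
    out = df.copy()
    out["person_id"] = out["person_id"].astype(str).map(
        lambda value: hashlib.sha256((SALT + value).encode()).hexdigest()
    )
    # Drop other direct identifiers (hypothetical column names).
    return out.drop(columns=["name", "email"], errors="ignore")
```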
6.2 Privacy assessment
The Australian Government Agencies Privacy Code (the Privacy Code) requires Australian Government agencies subject to the Privacy Act 1988 to conduct a privacy impact assessment (PIA) for all ‘high privacy risk projects’. A project may be a high privacy risk if the agency reasonably considers that the project involves new or changed ways of handling personal information that are likely to have a significant impact on the privacy of individuals.
A Privacy Threshold Assessment (PTA) is a preliminary assessment to help you determine your project’s potential privacy impacts and give you a sense of the risk level, including whether it could be a ‘high privacy risk project’ requiring a PIA under the Code.
This assurance framework does not determine the timing for conducting a PIA or PTA – it may be appropriate that you conduct a PIA or PTA earlier than your assessment of the AI use case under this framework.
If no PIA or PTA has been undertaken, explain why and what consideration there has been of potential privacy impacts.
Privacy assessments should consider whether relevant individuals have provided informed consent, where required, to the collection, sharing and use of their personal information in the AI system’s training or operation, or in outputs used to make inferences. Also consider how any consent obtained has been recorded, including a description of the processes used to obtain it.
For more information, you should consult the guidance on the Office of the Australian Information Commissioner’s website. You can also consult your agency’s privacy officer and internal privacy policy and resources.
If your AI system has used or will use Indigenous data, you should also consider whether notions of ‘collective’ or ‘group’ privacy of First Nations people are relevant and refer to the guidelines in the Framework for Governance of Indigenous Data (see 5.2).
6.3 Authority to operate
The Protective Security Policy Framework (PSPF) applies to non‑corporate Commonwealth entities subject to the Public Governance, Performance and Accountability Act 2013 (PGPA Act).
Refer to the relevant sections of the PSPF on safeguarding information and communication technology (ICT) systems to support the secure and continuous delivery of government business.
Under the PSPF, entities must effectively implement the Australian Government Information Security Manual (ISM) security principles and must only use ICT systems that the determining authority (or their delegate) has authorised to operate based on the acceptance of the residual security risks associated with its operation.
In addition, the Australian Signals Directorate’s Engaging with Artificial Intelligence guidance outlines mitigation considerations for organisations to consider. It is highly recommended that your agency engages with and implements the mitigation considerations in the guidance.
AI systems that have already been authorised or fall within existing authorisations by your agency’s IT Security Adviser (ITSA) do not have to be re‑authorised.
It is recommended you engage with your agency’s ITSA early to ensure all PSPF and ISM requirements are fulfilled.
-
7. Transparency and explainability
7.1 Consultation
You should consult with a diverse range of internal and external stakeholders at every stage of your AI system’s deployment to help identify potential biases, privacy concerns, and other ethical and legal issues present in your AI use case. This process can also help foster transparency, accountability, and trust with your stakeholders and can help improve their understanding of the technology’s benefits and limitations. Refer to the stakeholders you identified in section 2.4.
If your project has the potential to significantly impact Aboriginal and Torres Strait Islander peoples or communities, it is critical that you meaningfully consult with relevant community representatives.
Consultation resources
APS Framework for Engagement and Participation – sets principles and standards that underpin effective APS engagement with citizens, community and business and includes practical guidance on engagement methods.
Office of Impact Analysis Best Practice Consultation guidance note – provides a detailed explanation of the application of the whole-of-government consultation principles outlined in the Australian Government Guide to Policy Impact Analysis.
AIATSIS Principles for engagement in projects concerning Aboriginal and Torres Strait Islander peoples – provides non-Indigenous policy makers and service designers with the foundational principles for meaningfully engaging with Aboriginal and Torres Strait Islander peoples on projects that impact their communities.
7.2 Public visibility
Where appropriate, you should make the scope and goals of your AI use case publicly available. You should consider publishing relevant, accessible information about your AI use case in a centralised location on your agency website. This information could include:
- use case purpose
- overview of model and application
- benefits
- risks and mitigations
- training data sources
- compliance with the Policy for the responsible use of AI in government
- contact officer information.
Note: All agencies in scope of the Policy for the responsible use of AI in government are required to publish an AI transparency statement. More information on this requirement can be found in the policy and associated guidance. You may wish to include information about your use case in your agency’s AI transparency statement.
Considerations for publishing
In some circumstances it may not be appropriate to publish detailed information about your AI use case. When deciding whether to publish this information you should balance the public benefits of AI transparency with the potential risks as well as compatibility with any legal requirements around publication.
For example, you may choose to limit the amount of information you publish or not publish any information at all if:
- the AI use case is still in the experimentation phase
- publishing may have negative implications for national security
- publishing may have negative implications for criminal intelligence activities
- publishing may significantly increase the risk of fraud or non-compliance
- publishing may significantly increase the risk of cybersecurity threats
- publishing may jeopardise commercial competitiveness.
You may also wish to refer to the exemptions under the Freedom of Information Act 1982 in considering whether it is appropriate to publish information about your AI use case.
7.3 Maintain appropriate documentation and records
Agencies should comply with legislation, policies and standards for maintaining reliable and auditable records of decisions, testing, and the information and data assets used in an AI system. This will enable internal and external scrutiny, continuity of knowledge and accountability. This will also support transparency across the AI supply chain – for example, this documentation may be useful to any downstream users of AI models or systems developed by your agency.
Agencies should document AI technologies they are using to perform government functions as well as essential information about AI models, their versions, creators and owners. In addition, artifacts used and produced by AI – such as prompts, inputs and raw outputs – may constitute Commonwealth records under the Archives Act 1983 and may need to be kept for certain periods of time identified in records authorities issued by the National Archives of Australia (NAA).
To identify their legal obligations, business areas implementing AI in agencies may want to consult with their information and records management teams. The NAA can also provide advice on how to manage data and records produced by different AI use cases.
The NAA Information Management Standard for Australian Government outlines principles and expectations for the creation and management of government business information. Further guidance relating to AI records is available on the NAA website under Information Management for Current, Emerging and Critical Technologies.
AI documentation types
Where suitable, you should consider creating the following forms of documentation for any AI system you build. If you are procuring an AI system from an external provider, it may be appropriate to request these documents as part of your tender process.
System factsheet/model card
A system factsheet (sometimes called a model card) is a short document designed to provide an overview of an AI system to non-technical audiences (such as users, members of the public, procurers, and auditors). These factsheets usually include information about the AI system’s purpose, intended use, limitations, training data, and performance against key metrics.
Examples of system factsheets include Google Cloud Model Cards and IBM AI factsheets.
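As an illustration only, a system factsheet can be maintained as structured, machine-readable data alongside the model so it stays current and easy to publish or audit. The fields and values below are a hypothetical minimal set, not a prescribed template.

```python
import json

# Hypothetical minimal system factsheet captured as structured, machine-readable data.
factsheet = {
    "system_name": "Example correspondence triage assistant",
    "version": "1.2.0",
    "purpose": "Rank incoming correspondence for manual review",
    "intended_use": "Decision support only; a human officer makes the final decision",
    "limitations": ["Not validated on non-English correspondence"],
    "training_data": "Reference to the relevant datasheet",
    "key_metrics": {"accuracy": 0.93, "recall_high_priority": 0.97},
    "owner": "Business area responsible for the use case",
    "last_reviewed": "2025-01-01",
}

with open("system_factsheet.json", "w") as f:
    json.dump(factsheet, f, indent=2)
```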
Datasheets
Datasheets are documents completed by dataset creators to provide an overview of the data used to train and evaluate an AI system. Datasheets provide key information about the dataset including its contents, data owners, composition, intended uses, sensitivities, provenance, labelling and representativeness.
Examples of datasheets include Google’s AI data cards and Microsoft’s Aether Data Documentation template.
System decision registries
System decision registries record key decisions made during the development and deployment of an AI system. These registries contain information about what decisions were made, when they were made, who made them and why they were made (the decision rationale).
Examples of decision registries include Atlassian’s DACI decision documentation template and Microsoft’s Design Decision Log.
Documentation in relation to reliability and safety
It is also best practice to maintain documentation on testing, piloting and monitoring and evaluation of your AI system and use case, in line with the practices outlined in section 5.
See Implementing Australia’s AI Ethics Principles for more on AI documentation.
7.4 Disclosing AI interactions and outputs
You should design your use case to inform people (including members of the public, APS staff and decision-makers) that they are interacting with an AI system or are being exposed to content that has been generated by AI.
When to disclose use of AI
You should ensure that you disclose when a user is directly interacting with an AI system, especially:
- when AI plays a significant role in critical decision-making processes
- when AI has potential to influence opinions, beliefs or perceptions
- where there is a legal requirement regarding AI disclosure
- where AI is used to generate recommendations for content, products or services.
You should ensure that you disclose when someone is being exposed to AI-generated content and:
- any of the content has not been through a contextually appropriate degree of fact checking and editorial review by a human with the appropriate skills, knowledge or experience in the relevant subject matter
- the content purports to portray real people, places or events or could be misinterpreted that way
- the intended audience for the content would reasonably expect disclosure.
Exercise judgment and consider the level of disclosure that the intended audience would expect, including where AI-generated content has been through rigorous fact-checking and editorial review. Err on the side of greater disclosure – norms around appropriate disclosure will continue to develop as AI-generated content becomes more ubiquitous.
Mechanisms for disclosure of AI interactions:
When designing or procuring an AI system, you should consider the most appropriate mechanism(s) for disclosing AI interactions. Some examples are outlined below:
Verbal or written disclosures
Verbal or written disclosures are statements that are heard by or shown to users to inform them that they are interacting with (or will be interacting with) an AI system.
For example, disclaimers, warnings, specific clauses in privacy policy and/or terms of use, content labels, visible watermarks, by-lines, physical signage, communication campaigns.
Behavioural disclosures
Behavioural disclosure refers to the use of stylistic indicators that help users to identify that they are engaging with AI-generated content. These indicators should generally be used in combination with other forms of disclosure.
For example, using clearly synthetic voices, formal and structured language, or robotic avatars.
Technical disclosures
Technical disclosures are machine-readable identifiers for AI‑generated content.
For example, inclusion in metadata, technical watermarks, cryptographic signatures.
Agencies should consider AI systems that support industry-standard provenance technologies, such as those aligned with the standard developed by the Coalition for Content Provenance and Authenticity (C2PA).
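The sketch below is a simplified illustration of a technical disclosure: attaching machine-readable provenance metadata and a cryptographic signature to AI-generated text so downstream systems can verify its origin. It is not a C2PA implementation; production systems should prefer industry-standard provenance tooling, and the key handling and field names here are hypothetical.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone

SIGNING_KEY = b"retrieve-from-a-key-management-service"  # hypothetical signing key

def label_ai_output(text: str, model_id: str) -> dict:
    """Attach machine-readable provenance metadata and a signature to AI output."""
    record = {
        "content": text,
        "generator": model_id,
        "ai_generated": True,
        "generated_at": datetime.now(timezone.utc).isoformat(),
    }
    payload = json.dumps(record, sort_keys=True).encode()
    record["signature"] = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return record

def verify(record: dict) -> bool:
    """Recompute the signature to confirm the provenance record is intact."""
    claimed = record.pop("signature")
    payload = json.dumps(record, sort_keys=True).encode()
    record["signature"] = claimed  # restore the record
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(claimed, expected)
```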
7.5 Offer appropriate explanations
Explainability refers to accurately and effectively conveying an AI system’s decision process to a stakeholder, even if they don’t fully understand the specifics of how the model works. Explainability facilitates transparency, independent expert scrutiny and access to justice.
You should be able to clearly explain how a government decision or outcome has been made or informed by AI to a range of technical and non-technical audiences. You should also be aware of any requirements in legislation to provide reasons for decisions, both generally and in relation to the particular class of decisions that you are seeking to make using AI.
Explanations may apply globally (how a model broadly works) or locally (why the model has come to a specific decision). You should determine which is more appropriate for your audience.
Principles for providing effective explanations
Contrastive
Outline why the AI system output one outcome instead of another outcome.
Selective
Focus on the most relevant factors contributing to the AI system’s decision process.
Consistent with the audience’s understanding
Align with the audience’s level of technical (or non-technical) background.
Generalisation to similar cases
Generalise to similar cases to help the audience predict what the AI system will do.
You may wish to refer to Interpretable Machine Learning: A Guide for Making Black Box Models Explainable for further advice and examples.
Tools for explaining non-interpretable models
While explanations for interpretable models (i.e. low complexity with clear parameters) are relatively straightforward, in practice most AI systems have low interpretability and require effective post-hoc explanations that strike a balance between accuracy and simplicity. Agencies should also consider, among other matters, appropriate timeframes for providing explanations in the context of their use case.
Below are some tools and approaches that can assist with developing explanations. Note that explainable AI algorithms are not the only way to improve system explainability; designing effective explanation interfaces, for example, can also help.
Local explanations
- Feature-importance analysis (e.g. random forest feature permutation analysis, saliency maps, feature reconstructions, individual conditional expectation (ICE) plots)
- Partial dependence plots (PDPs)
- Shapley values
Global explanations
- Example-based methods (contrastive and counterfactual examples, data explorers/visualisation)
- Model-agnostic methods
- Feature-importance methods
- Methods specifically for neural-network interpretation
- Methods specifically for deep learning in cloud environments
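As a hedged illustration of the feature-importance and partial dependence approaches listed above, the sketch below uses scikit-learn and assumes a fitted estimator `clf`, a pandas feature matrix `X` and labels `y`; it is a starting point for analysis, not a complete explanation method.

```python
# Illustrative post-hoc explanation aids, assuming a fitted scikit-learn
# estimator `clf`, a pandas DataFrame of features `X` and labels `y`.
from sklearn.inspection import partial_dependence, permutation_importance

def explain(clf, X, y):
    # Global view: how much does performance degrade when each feature is shuffled?
    result = permutation_importance(clf, X, y, n_repeats=10, random_state=0)
    ranked = sorted(zip(X.columns, result.importances_mean), key=lambda item: -item[1])
    print("Top features by permutation importance:", ranked[:5])

    # Average predicted outcome across values of the most important feature.
    top_feature = ranked[0][0]
    pd_result = partial_dependence(clf, X, features=[top_feature])
    print("Partial dependence (average predictions):", pd_result["average"][0])
```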
Advice on appropriate explanations is available in the NAIC’s Implementing Australia’s AI Ethics Principles report.
-
8. Contestability
8.1 Notification of AI affecting rights
You should notify individuals, groups, communities or businesses when an administrative action materially influenced by an AI system has a legal or similarly significant effect on them. This notification should state that the action was materially influenced by an AI system and include information on available review rights and whether and how the individual can challenge the action.
An action producing a legal effect is when an individual, group, community or business’s legal status or rights are affected, and includes:
- provision of benefits granted by legislation
- contractual rights.
An action producing a similarly significant effect is when an individual, group, community or business’s circumstances, behaviours or choices are affected, and includes:
- denial of consequential services or support, such as housing, insurance, education enrolment, criminal justice, employment opportunities and health care services
- provision of basic necessities, such as food and water.
A decision may be considered to have been materially influenced by an AI system if:
- the decision was automated by an AI system, with little to no human oversight
- a component of the decision was automated by an AI system, with little to no human oversight (for example, a computer makes the first 2 limbs of a decision, with the final limb made by a human)
- the AI system is likely to influence decisions that are made (for example, the output of the AI system recommended a decision to a human for consideration or provided substantive analysis to inform a decision).
‘Administrative action’ is any of the following:
- making, refusing or failing to make a decision
- exercising, refusing or failing to exercise a power
- performing, refusing or failing to perform a function or duty.
Note: this guidance is designed to supplement, not replace, existing administrative law requirements pertaining to notification of administrative decisions. The Attorney-General’s Department is leading work to develop a consistent legislative framework for automated decision making (ADM), as part of the government’s response to recommendation 17.1 of the Robodebt Royal Commission Report. The Australian Government AI assurance framework will continue to evolve to ensure alignment as this work progresses.
8.2 Challenging administrative actions influenced by AI
Individuals, groups, communities or businesses subject to an administrative action materially influenced by an AI system that has a legal or similarly significant effect on them should be provided with an opportunity to challenge this action. This is an important administrative law principle. See guidance on section 8.1 above for assistance interpreting terminology.
Administrative actions may be subject to both merits review and judicial review. Merits review considers whether a decision made was the correct or preferable one in the circumstances, and includes internal review conducted by the agency and external review processes. Judicial review examines whether a decision was legally correct.
You should ensure that review rights that ordinarily apply to human-made decisions or actions are not impacted or limited because an AI system has been used.
Notifications discussed at section 8.1 should include information about available review mechanisms so that people can make informed decisions about disputing administrative actions.
You will need to ensure a person within your agency is able to answer questions in a court or tribunal about an administrative action taken by an AI system if that matter is ultimately challenged. Review mechanisms also impact on the obligation to provide reasons. For example, the Administrative Decisions (Judicial Review) Act 1977 gives applicants a right to reasons for administrative decisions.
-
9. Accountability
9.1 Establishing responsibilities
Establishing clear roles and responsibilities is essential for ensuring accountability in the development and use of AI systems. In this section, you are asked to identify the individuals responsible for 3 key aspects of your AI system:
Use of AI insights and decisions
The person responsible for the application of the AI system’s outputs, including making decisions or taking actions based on those outputs.
Monitoring the performance of the AI system
The person responsible for overseeing the ongoing performance and safety of the AI system, including monitoring for errors, biases or unintended consequences.
Data governance
The person responsible for the governance of the data used for operating, training or validating the AI system.
Where feasible, it is recommended that these 3 roles not all be held by the same person. The responsible officers should be appropriately senior, skilled and qualified for their respective roles.
9.2 Training of AI system operators
AI system operators play a crucial role in ensuring the responsible and effective use of AI. They must have the necessary skills, knowledge and judgment to understand the system’s capabilities and limitations, how to appropriately use the system, interpret its outputs and make informed decisions based on those outputs.
In your answer, describe the process for ensuring AI system operators are adequately trained and skilled. This may include:
Initial training
What training do operators receive before being allowed to use the AI system? Does this training cover technical aspects of the system, as well as ethical and legal considerations?
Ongoing training
Is there a process for continuous learning and skill development? How are operators kept up to date with changes or updates to the AI system?
Evaluation
Are operators’ skills and knowledge assessed? Are there any certification or qualification requirements?
Support
What resources and support are available to operators if they have questions or encounter issues?
Consider whether this needs to be tailored to the specific needs and risks of your AI system or proposed use case or whether general AI training requirements are sufficient.
-
10. Human-centred values
10.1 Incorporating diversity
Diversity of perspective promotes inclusivity, mitigates biases, supports critical thinking and should be incorporated in all AI system lifecycle stages.
AI systems require input from stakeholders from a variety of backgrounds, including different ethnicities, genders, ages, abilities and socio-economic statuses. This also includes people with diverse professional backgrounds, such as ethicists, social scientists and domain experts relevant to the AI application. Determining which stakeholders and user groups to consult, which data to use, and the optimal team composition will depend on your AI system.
The following examples demonstrate the often-unintended negative consequences of AI systems that failed to adequately incorporate diversity into relevant lifecycle stages.
- AI systems that were ineffective at predicting recidivism outcomes for defendants of colour and underestimated the health needs of patients from marginalised racial and ethnic backgrounds.
- AI job recruitment systems that unfairly affected employment outcomes.
- Algorithms used to prioritise patients for high-risk care management programs that were less likely to refer black patients than white patients with the same level of health.
- An AI system designed to detect cancers that showed bias towards lighter skin tones, stemming from an oversight in collecting a more diverse set of skin tone images, potentially delaying life-saving treatments.
Resources, including approaches, templates and methods to ensure sufficient diversity and inclusion of your AI system, are described in the NAIC’s Implementing Australia’s AI Ethics Principles report.
10.2 Human rights obligations
You should consult an appropriate source of legal advice or otherwise ensure that your AI use case and use of data align with human rights obligations. If you have not done so, explain your reasoning.
It is recommended that you complete this question after you have completed the previous sections of the assessment. This will provide more complete information to enable an assessment of the human rights implications of your AI use case.
In Australia, it is unlawful to discriminate on the basis of a number of protected attributes including age, disability, race, sex, intersex status, gender identity and sexual orientation in certain areas of public life, including education and employment. Australia's federal anti‑discrimination laws are contained in the following legislation:
- Age Discrimination Act 2004
- Disability Discrimination Act 1992
- Racial Discrimination Act 1975
- Sex Discrimination Act 1984.
Human rights are defined in the Human Rights (Parliamentary Scrutiny) Act 2011 as the rights and freedoms contained in the 7 core international human rights treaties to which Australia is a party, namely the:
- International Covenant on Civil and Political Rights (ICCPR).
- International Covenant on Economic, Social and Cultural Rights (ICESCR).
- International Convention on the Elimination of All Forms of Racial Discrimination (CERD).
- Convention on the Elimination of All Forms of Discrimination against Women (CEDAW).
- Convention against Torture and Other Cruel, Inhuman or Degrading Treatment or Punishment (CAT).
- Convention on the Rights of the Child (CRC).
- Convention on the Rights of Persons with Disabilities (CRPD).
-
11. Internal review and next steps
11.1 Legal review of AI use case
If the threshold assessment in section 3 results in a risk rating of ‘medium’ or ‘high’, your AI use case must undergo legal review to ensure that the use case and associated use of data meet legal requirements.
The nature of the legal review is context dependent. Without limiting the scope of legal review, examples of potentially applicable legislation, policies and frameworks are outlined at Attachment A of the Policy for the responsible use of AI in government.
If there are significant changes to the AI use case (including changes introduced due to recommendations from internal or external review), then the advice should be revisited to ensure the AI use case and associated use of data continues to meet legal requirements.
11.2 Risk summary table
To complete the risk summary table, list any:
- risks assessed in section 3 (the threshold assessment) as ‘medium’ or ‘high’
- instances where you have answered ‘no’ to questions in sections 4 to 10. You are encouraged to identify risk treatments in relation to these; however, you do not need to assign a residual risk rating to those risks
- additional risks that have been identified throughout the assessment process
- risk treatments identified during internal review (section 11.3) and, if applicable, external review (section 11.4) – using the risk matrix in section 3 to assess residual risk.
11.3 Internal review of AI use case
This requires an internal agency governance body designated by your agency’s Accountable Authority to review the assessment and the risks outlined in the risk summary table.
The governance body may decide to accept any ‘medium’ risks, to recommend risk treatments, or decide not to accept the risk and recommend not proceeding with the AI use case. You should list the recommendations of your agency governance body in the text box provided.
11.4 External review of AI use case
If, following internal review (section 11.3), there are any residual risks with a ‘high’ risk rating, your agency should consider whether the AI use case and this assessment would benefit from external review. This external review may recommend further risk treatments or adjustments to the use case.
In line with the APS Strategic Commissioning Framework, consider whether someone in the APS could conduct this review or whether the nature of the use case and identified risks warrant independent outside review and expertise.
Your agency must consider recommendations of an external review, decide which to implement, and whether to accept any residual risk and proceed with the use case. If applicable, you should list any recommendations arising from external review in the text box provided and record the agency's response to these recommendations.
-
Attachment
Risk consequence rating advice
Negatively affecting public accessibility or inclusivity of government services
Insignificant
- Insignificant compromises to accessibility or inclusivity of services.
- Minor technical issues causing brief inconvenience but no actual barriers to access or inclusion.
- Issues rapidly resolved with minimal impact on user experience.
Minor
- Limited, reversible compromises to accessibility or inclusivity of services.
- Some people experience difficulties accessing services due to technical issues or design oversights.
- Barriers are short-term and addressed once identified, with additional support provided to people affected.
Moderate
- Many compromises are made to the accessibility or inclusivity of services.
- Considerable access challenges for a modest number of users.
- Resolving access issues requires substantial effort and resources.
- Certain groups may be disproportionately impacted.
- Affected users experience frustration and delays in receiving services.
Major
- Extensive compromises are made to the accessibility or inclusivity of services, which may include some essential services.
- Ongoing delays that require external technical assistance to resolve.
- Widespread inconvenience, frustration, public distress and potential legal implications.
- Vulnerable user groups disproportionately impacted.
Severe
- Widespread irreversible ongoing compromises are made to the accessibility or inclusivity of services, including some essential services.
- Majority of users, especially vulnerable groups affected.
- Essential services inaccessible for extended periods, causing significant public distress, legal implications, and a loss of trust in government efficiency.
- Comprehensive and immediate actions are urgently needed to rectify the situation.
Unfair discrimination against individuals, communities or groups
Insignificant
- Negligible instances of discrimination, with virtually no discernible effect on individuals, communities, or groups.
- Issues are proactively identified and rapidly addressed before causing harm.
Minor
- Limited instances of unfair discrimination occur, affecting a small number of individuals.
- Relatively isolated cases, and corrective measures minimise their impact.
Moderate
- Moderate levels of discrimination leading to noticeable harm to certain individuals, communities, or groups.
- These incidents raise bias and fairness concerns and require targeted interventions.
Major
- Significant discrimination results in major, tangible harm to individuals and multiple communities or groups.
- Rebuilding trust requires substantial reforms and remediation efforts.
Severe
- Pervasive and systemic discrimination causes severe harm across a broad spectrum of the population, particularly marginalised and vulnerable groups.
- Public outrage, potential legal action, and a profound loss of trust in government.
- Immediate, sweeping reforms and accountability measures are required.
Perpetuating stereotyping or demeaning representations of individuals, communities or groups
Insignificant
- Inadvertently reinforce mild stereotypes, but these instances are quickly identified and rectified with no lasting harm or public concern.
Minor
- Isolated cases of stereotyping, affecting a limited number of community members, with some noticing and raising concerns.
- Prompt action mitigates the issue, preventing broader impact.
Moderate
- Moderate stereotyping by AI systems leads to noticeable public discomfort and criticism.
- Disproportionately affecting certain communities or groups.
- Requires targeted corrective measures to address and prevent recurrence.
Major
- Significant and widespread reinforcement of harmful stereotypes and demeaning representations.
- Causes public outcry and damages the relationship between communities and government entities.
- Urgent, comprehensive strategies are needed to rectify these representations and restore trust.
Severe
- Pervasive and damaging stereotyping severely harms multiple communities, leading to widespread distress.
- Potential legal consequences, and a profound breach of trust in government use of technology.
- Requires immediate, sweeping actions to address the harm, including system overhauls and public apologies.
Harm to individuals, communities, groups, businesses or the environment
Insignificant
- Inconsequential glitches with no real harm to the public, business operations or ecosystems.
- Easily managed through routine measures.
Minor
- Isolated incidents mildly affecting the public.
- Slight inconveniences or disruptions to businesses, leading to manageable financial costs.
- Limited manageable environmental disturbances affecting local ecosystems or resource consumption.
Moderate
- Noticeable negative effects on the public.
- Businesses face operational challenges or financial losses, affecting their competitiveness.
- Obvious environmental degradation, including pollution or habitat disruption, prompting public concern.
Major
- Significant public harm causing distress and potentially lasting damage.
- Significant harm to a wide range of businesses, resulting in substantial financial losses, layoffs, and long-term reputational damage.
- Compromises ecosystem wellbeing causing substantial pollution, loss of biodiversity, and resource depletion.
Severe
- Widespread, profound harm and severe distress affecting broad segments of the public.
- Profound damage across the business sector, leading to bankruptcies, major job losses, and a lasting negative impact on the economy.
- Comprehensive environmental destruction, leading to critical loss of biodiversity, irreversible ecosystem damage, and severe resource scarcity.
Compromising privacy due to the sensitivity, amount or source of the data being used by an AI system
Insignificant
- Insignificant data handling errors occur without compromising sensitive information.
- Incidents are quickly rectified, maintaining public trust in data security.
Minor
- Isolated exposure of limited sensitive data affects a small group of individuals.
- Swift actions taken to secure the data and prevent further incidents.
Moderate
- Breach of moderate amounts of sensitive data, leading to privacy concerns among the affected populace.
- Some individuals experience inconvenience and distress.
Major
- Serious misuse of sensitive private data affects a large segment of the population, leading to widespread privacy violations and a loss of public trust.
- Comprehensive measures are urgently required to secure data and address the privacy breaches.
Severe
- Significant potential to expose sensitive information of a vast number of individuals, causing severe harm, identity-theft risks; use of sensitive personal information in a way that is likely to draw public criticism with limited ability for individuals to choose how their information is used.
- Significant potential to harm trust in government-information handling with potential for lasting consequences.
Raising security concerns due to the sensitivity or classification of the data being used by an AI system
Insignificant
- Inconsequential security lapses occur without actual misuse of sensitive data.
- Quickly identified and corrected with no real harm done.
- These types of incidents may serve as prompts for reviewing security protocols.
Minor
- A limited security breach involves unauthorised access to protected data affecting a small number of records with minimal impact.
- Immediate actions secure the breach, and affected individuals are notified and supported.
- Incident is catalyst for review of security protocols.
Moderate
- Security incident leads to the compromise of a moderate volume of sensitive data, raising concerns over data protection and privacy.
- The breach necessitates a thorough investigation and enhanced security measures.
Major
- A significant security breach results in extensive unauthorised access to sensitive or protected data, causing considerable concern and distress among the public.
- Urgent security upgrades and support measures for impacted individuals are implemented to restore security and trust.
Severe
- A massive security breach exposes a vast amount of sensitive and protected data, leading to severe implications for national security, public safety, and individual privacy.
- This incident triggers an emergency response, including legal actions, a major overhaul of security systems, and long-term support for those affected.
Raising security concerns due to implementation, sourcing or characteristics of the AI system
Insignificant
- Inconsequential security concerns arise due to characteristics of the AI system, such as software bugs, which are promptly identified and fixed with no adverse effects on overall security.
- These issues may serve as lessons, leading to slight improvements in the system's security framework.
Minor
- Certain characteristics of the AI system lead to vulnerabilities that are exploited in a limited manner, causing minor security breaches.
- Immediate remediation measures are taken, and the system is updated to prevent similar issues.
Moderate
- A moderate security risk is realised when intrinsic features of the AI system allow for unintended access or data leaks.
- Incident affects a noticeable but contained component of the AI system.
- Prompts a comprehensive security review of the AI system and the implementation of more robust safeguards.
Major
- Significant security flaws in the AI system's design result in major breaches, compromising a large amount of data and severely affecting system integrity.
- Incident leads to an urgent overhaul of security measures and protocols, alongside efforts to mitigate the damage.
Severe
- Critical security vulnerabilities inherent to the AI system lead to widespread breaches, exposing vast quantities of sensitive data and jeopardising national security or public safety.
- The incident results in severe consequences, necessitating emergency responses, extensive system redesigns, and long-term efforts to recover from the breach and prevent recurrence.
Influencing decision-making that affects individuals, communities, groups, businesses or the environment
Insignificant
- Decisions lead to negligible errors, swiftly identified and corrected with no harm to the public, business operations or the environment.
- Incidents may serve as a learning opportunity for system improvement.
Minor
- Decisions result in minor inconveniences or errors affecting the public, business operations or finances or slight environmental impacts.
- All impacts reversible with prompt action.
Moderate
- Decisions cause moderate harm to the public, business operations or finances or noticeable environmental degradation.
- Targeted interventions are required to mitigate these effects.
Major
- Significant harm to the public, substantial business financial losses or operational disruptions, or significant environmental damage.
- Loss of confidence in government, operations, service delivery and partnerships.
- Significant harm to a wide range of businesses, resulting in substantial financial losses, layoffs, and long-term reputational damage.
- Compromises ecosystem wellbeing causing substantial pollution, loss of biodiversity, and resource depletion.
Severe
- AI's influence on critical decision-making processes leads to severe and widespread harm to public, business operations or finances or the environment.
- Potentially endangering lives or significantly impacting public safety, rights and trust.
- Causes massive job losses, undermining business economic stability and viability.
- Catastrophic loss of ecosystems, endangered species, and long-term ecological imbalance or severe resource depletion.
Posing a reputational risk or undermining public confidence in the government
Insignificant
- Isolated reputational issues arise, quickly addressed and explained.
- Causes negligible damage to public trust in government capabilities.
Minor
- Small-scale AI mishaps lead to brief public concern, slightly denting the government's reputation.
- Prompt clarification and corrective measures minimise the long-term impact on public confidence.
- Seen by the government as poor management.
Moderate
- Misapplications result in moderate public dissatisfaction and questioning of government oversight.
- Requires remedial actions to mend trust and address concerns.
- Seen by government and opposition as failed management.
Major
- Widespread public scepticism and criticism, majorly affecting the government's image.
- Requires substantial efforts to rebuild public confidence through transparency, accountability, and improvement of AI governance.
- High-profile negative stories; seen by government and opposition as a significant failure of management.
Severe
- Severe misuse or failure of AI systems leads to profound public distrust and criticism.
- Significantly undermining confidence in government effectiveness and integrity.
- Requires comprehensive, long-term strategies for rehabilitation of public trust, including systemic changes and ongoing engagement.
- Seen by government and opposition as a catastrophic failure of management.
- Minister expresses loss of confidence or trust in the agency.
Risk likelihood table
- Almost certain (91% and above): The risk is almost certain to eventuate within the foreseeable future.
- Likely (61–90%): The risk will probably eventuate within the foreseeable future.
- Possible (31–60%): The risk may eventuate within the foreseeable future.
- Unlikely (5–30%): The risk may eventuate at some time but is not likely to occur in the foreseeable future.
- Rare (less than 5%): The risk will only eventuate in exceptional circumstances or as a result of a combination of unusual events.
-
1. Basic information
1.1 AI use case profile
Complete the information below:
• Name of AI use case.
• Reference number.
• Lead agency.
• Assessment contact officer (name and email).
• Executive sponsor (name and email).
1.2 AI use case description
In plain language, briefly explain how you are using or intend to use AI. 200 words or less.
1.3 Type of AI technology
Briefly explain what type of AI technology you are using or intend to use. 100 words or less.
1.4 Lifecycle stage
These stages can take place in an iterative manner and are not necessarily sequential. They are adapted from the OECD’s definition of the AI system lifecycle. Refer to guidance for further information. Select only one.
Which of the following lifecycle stages best describes the current stage of your AI use case?
- Early experimentation (note: assessment not required).
- Design, data and models
- Verification and validation
- Deployment
- Operation and monitoring
- Retirement
1.5 Review date
Assessments must be reviewed when use cases either move to a different stage of their lifecycle or significant changes occur to the scope, function or operational context of the use case. Consult the guidance document and, if in doubt, contact the DTA.
Indicate the date or milestone that will trigger the next review of the AI use case.
1.6 Assessment review history
Record the review history for this assessment. Include the review dates and brief summaries of changes arising from reviews (50 words or less).
-
2. Purpose and expected benefits
-
3. Threshold assessment
3.1 Risk assessment
Using the risk matrix, determine the severity of each of the risks in the table below, accounting for any risk mitigations and treatments. Provide a rationale and an explanation of relevant risk controls that are planned or in place. The guidance document contains consequence and likelihood descriptors and other information to support the risk assessment.
The risk assessment should reflect the intended scope, function and risk controls of the AI use case. Keep the rationale for each risk rating clear and concise, aiming for no more than 200 words per risk.
Risk matrix (Likelihood/Consequence):
- Almost certain: Insignificant = Medium; Minor = Medium; Moderate = High; Major = High; Severe = High.
- Likely: Insignificant = Medium; Minor = Medium; Moderate = Medium; Major = High; Severe = High.
- Possible: Insignificant = Low; Minor = Medium; Moderate = Medium; Major = High; Severe = High.
- Unlikely: Insignificant = Low; Minor = Low; Moderate = Medium; Major = Medium; Severe = High.
- Rare: Insignificant = Low; Minor = Low; Moderate = Low; Major = Medium; Severe = Medium.
An illustrative lookup sketch showing how the matrix combines likelihood and consequence ratings appears after the list of risks below.
What is the risk (low, medium or high) of the use of AI:
- Negatively affecting public accessibility or inclusivity of government services?
- Unfairly discriminating against individuals, communities or groups?
- Perpetuating stereotyping or demeaning representations of individuals, communities or groups?
- Harming individuals, communities, groups, organisations or the environment?
- Raising privacy concerns due to the sensitivity, amount or source of the data being used by an AI system?
- Raising security concerns due to the sensitivity or classification of the data being used by an AI system?
- Raising security concerns due to the implementation, sourcing or characteristics of the AI system?
- Influencing decision-making that affects individuals, communities, groups, organisations or the environment?
- Posing a reputational risk or undermining public confidence in the government?
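To show how the matrix combines the likelihood and consequence ratings described in the guidance, here is a minimal Python sketch. The table and function names (RISK_MATRIX, rate_risk) are illustrative only and are not part of the assessment template.

```python
# Illustrative only: encodes the risk matrix above as a lookup table.
# The names RISK_MATRIX and rate_risk are hypothetical, not part of the template.

RISK_MATRIX = {
    "Almost certain": {"Insignificant": "Medium", "Minor": "Medium", "Moderate": "High",   "Major": "High",   "Severe": "High"},
    "Likely":         {"Insignificant": "Medium", "Minor": "Medium", "Moderate": "Medium", "Major": "High",   "Severe": "High"},
    "Possible":       {"Insignificant": "Low",    "Minor": "Medium", "Moderate": "Medium", "Major": "High",   "Severe": "High"},
    "Unlikely":       {"Insignificant": "Low",    "Minor": "Low",    "Moderate": "Medium", "Major": "Medium", "Severe": "High"},
    "Rare":           {"Insignificant": "Low",    "Minor": "Low",    "Moderate": "Low",    "Major": "Medium", "Severe": "Medium"},
}

def rate_risk(likelihood: str, consequence: str) -> str:
    """Return the risk level (Low, Medium or High) for a likelihood/consequence pair."""
    return RISK_MATRIX[likelihood][consequence]

# Example: a 'Possible' likelihood with a 'Major' consequence rates as High,
# which would require a full assessment under section 3.2.
print(rate_risk("Possible", "Major"))  # High
```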
3.2 Assessment contact officer recommendation
If the assessment contact officer is satisfied that all risks in the threshold assessment are low, then they may recommend that a full assessment is not needed and that the agency accept the low risk.
If one or more risks are medium or above, then a full assessment must be completed, unless you amend the scope, function or risk controls of the AI use case such that the assessment contact officer is satisfied that all risks in the threshold assessment are low.
You may decide not to accept the risk and not proceed with the AI use case.
The assessment contact officer recommendation should include:
- the statement ‘a full assessment is/is not necessary for this use case’
- comments (optional)
- name and position
- date.
3.3 Executive sponsor endorsement
The executive sponsor endorsement should include:
- the statement ‘I have reviewed the recommendation, am satisfied by the supporting analysis and agree that a full assessment is/is not necessary for this use case’
- comments (optional)
- name and position
- date.
-
4. Fairness
-
For each of the following questions, indicate either yes, no or N/A, and explain your answer.
4.1 Defining fairness
Do you have a clear definition of what constitutes a fair outcome in the context of your use of AI?
Where appropriate, you should consult relevant domain experts, affected parties and stakeholders to determine how to contextualise fairness for your use of AI. Consider inclusion and accessibility. Consult the guidance document for prompts and resources to assist you.
4.2 Measuring fairness
Do you have a way of measuring (quantitatively or qualitatively) the fairness of system outcomes?
Measuring fairness is an important step in identifying and mitigating fairness risks. A wide range of metrics are available to address various concepts of fairness. Consult the guidance document for resources to assist you.
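As one example of a quantitative fairness measure, the sketch below computes the difference in selection rates between demographic groups (a demographic-parity style check). The data, column names and threshold are hypothetical and would need to be chosen for your use case.

```python
# Illustrative only: a demographic-parity style check on system outcomes.
# The dataframe, column names and the 0.1 threshold are hypothetical examples.
import pandas as pd

outcomes = pd.DataFrame({
    "group":    ["A", "A", "A", "B", "B", "B"],
    "approved": [1,   1,   0,   1,   0,   0],
})

# Selection rate (share of positive outcomes) per group.
rates = outcomes.groupby("group")["approved"].mean()
parity_gap = rates.max() - rates.min()

print(rates.round(2).to_dict())          # {'A': 0.67, 'B': 0.33}
print(f"Parity gap: {parity_gap:.2f}")

# A gap above an agreed threshold could prompt further investigation.
if parity_gap > 0.1:
    print("Selection-rate gap exceeds threshold - review for potential unfairness.")
```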
-
5. Reliability and safety
-
For each of the following questions, indicate either yes, no or N/A, and explain your answer.
5.1 Data suitability
If your AI system requires the input of data to operate, or you are training or evaluating an AI model, can you explain why the chosen data is suitable for your use case?
Consider data quality and factors such as accuracy, timeliness, completeness, consistency, lineage, provenance and volume.
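Several of these data quality factors can be checked automatically before a dataset is accepted for training or validation. The sketch below is a minimal illustration; the file name, column names and thresholds are hypothetical and should be replaced with acceptance criteria appropriate to your use case.

```python
# Illustrative only: basic data-quality checks before training or validation.
# The file name, columns and thresholds are hypothetical examples.
import pandas as pd

df = pd.read_csv("training_data.csv", parse_dates=["collected_at"])

# Completeness: share of missing values per column.
missing_share = df.isna().mean()

# Duplication: proportion of fully duplicated rows.
duplicate_share = df.duplicated().mean()

# Timeliness: age of the most recent record, in days.
age_days = (pd.Timestamp.now() - df["collected_at"].max()).days

issues = []
if (missing_share > 0.05).any():
    issues.append("columns with more than 5% missing values")
if duplicate_share > 0.01:
    issues.append("more than 1% duplicated rows")
if age_days > 365:
    issues.append("most recent record is over a year old")

print(issues or "No data-quality issues found against these thresholds.")
```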
5.2 Indigenous data
If your AI system uses Indigenous data, including where any outputs relate to Indigenous people, have you ensured that your AI use case is consistent with the Framework for Governance of Indigenous Data?
Consider whether your use of Indigenous data and AI outputs is consistent with the expectations of Indigenous people, and the Framework for Governance of Indigenous Data (GID). See definition of Indigenous data in guidance material.
5.3 Suitability of procured AI model
If you are procuring an AI model, can you explain its suitability for your use case?
May include multiple models or a class of models. Includes using open-source models, application programming interfaces (APIs) or otherwise sourcing or adapting models. Factors to consider are outlined in guidance.
5.4 Testing
Outline any areas of concern in the results from testing. If testing has not yet occurred, outline the elements to be considered in the testing plan (for example, the model's accuracy).
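Where acceptance criteria have been agreed, testing against them can often be automated. The sketch below checks a model's accuracy on a held-out test set against a minimum threshold; the model, test data and the 0.90 threshold are hypothetical.

```python
# Illustrative only: checking a model against a pre-agreed acceptance criterion.
# The model, test data and the 0.90 threshold are hypothetical examples.
from sklearn.metrics import accuracy_score

ACCEPTANCE_THRESHOLD = 0.90  # agreed minimum accuracy for this use case

def meets_acceptance_criteria(model, X_test, y_test) -> bool:
    """Return True if accuracy on the held-out test set meets the threshold."""
    accuracy = accuracy_score(y_test, model.predict(X_test))
    print(f"Test accuracy: {accuracy:.3f} (threshold {ACCEPTANCE_THRESHOLD})")
    return accuracy >= ACCEPTANCE_THRESHOLD
```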
5.5 Pilot
Have you conducted, or will you conduct, a pilot of your use case before deploying?
If answering ‘yes’, explain what you have learned or hope to learn in relation to reliability and safety and, if applicable, outline how you adjusted the use of AI.
5.6 Monitoring
Have you established a plan to monitor and evaluate the performance of your AI system?
If answering ‘yes’, explain how you will monitor and evaluate performance.
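Ongoing monitoring can be partly automated, for example by comparing recent performance with a validation baseline and escalating on material degradation. The sketch below is a minimal illustration; the baseline, tolerance and logger names are hypothetical.

```python
# Illustrative only: a periodic performance check for a deployed AI system.
# The baseline, tolerance and logger name are hypothetical examples.
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("ai_monitoring")

BASELINE_ACCURACY = 0.92   # accuracy observed during validation
TOLERANCE = 0.05           # acceptable drop before escalation

def check_performance(recent_accuracy: float) -> None:
    """Log an alert if recent accuracy has dropped materially below the baseline."""
    drop = BASELINE_ACCURACY - recent_accuracy
    if drop > TOLERANCE:
        logger.warning("Accuracy dropped by %.3f - escalate for review.", drop)
    else:
        logger.info("Accuracy within tolerance (drop of %.3f).", drop)

check_performance(0.85)  # triggers the warning path in this example
```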
5.7 Preparedness to intervene or disengage
Have you established clear processes for human intervention or safely disengaging the AI system where necessary (for example, if stakeholders raise valid concerns with insights or decisions or an unresolvable issue is identified)?
See guidance document for resources to assist you in establishing appropriate processes.
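One common pattern for intervention and disengagement combines an operational kill switch with a confidence gate that routes uncertain cases to a human reviewer. The sketch below illustrates the pattern; the flag, threshold and function names are hypothetical.

```python
# Illustrative only: a kill switch plus a confidence gate for human review.
# The flag, threshold and function names are hypothetical examples.

AI_ENABLED = True            # operational kill switch; set False to disengage the AI
REVIEW_THRESHOLD = 0.8       # below this confidence, a human decides

def route_case(ai_prediction, confidence: float):
    """Return the AI result only when the system is enabled and sufficiently confident."""
    if not AI_ENABLED:
        return "manual_process"          # AI disengaged entirely
    if confidence < REVIEW_THRESHOLD:
        return "human_review"            # human intervention required
    return ai_prediction                 # AI output used, subject to monitoring

print(route_case("approve", confidence=0.65))  # human_review
```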
-
6. Privacy protection and security
-
For each of the following questions, indicate either yes, no or N/A, and explain your answer.
6.1 Minimise and protect personal information
Are you satisfied that any collection, use or disclosure of personal information is necessary, reasonable and proportionate for your AI use case?
See guidance on data minimisation and privacy enhancing technologies.
6.2 Privacy assessment
Has the AI use case undergone a Privacy Threshold Assessment or Privacy Impact Assessment?
6.3 Authority to operate
Has the AI system been authorised or does it fall within an existing authority to operate in your environment, in accordance with Protective Security Policy Framework (PSPF) Policy 11: Robust ICT systems?
Engage with your agency’s IT Security Adviser and consider the latest security guidance and strategies for AI use (such as Engaging with AI from the Australian Signals Directorate).
-
7. Transparency and explainability