• Copilot survey questions

    We have published the questions from the Microsoft 365 Copilot trial’s evaluation surveys to help organisations create their own generative AI evaluations.

    The evaluation was designed in consultation with the Australian Centre for Evaluation.

    More information about the evaluation approach can be found in Appendix B of the full report.

    About the surveys

    Participants in the whole-of-government Copilot trial were encouraged to complete all 3 evaluation surveys:

    • pre-use survey, first issued on 29 February 2024
    • mid-trial ‘pulse’ survey, first issued on 3 May 2024
    • post-use survey, first issued on 2 July 2024.

    Information collected

    The surveys were designed to keep participant responses anonymous but still allow the evaluation team to link responses across the 3 surveys.

    Participants were required to consent to their responses being used in the evaluation. They also had the option to allow their free-text answers to be quoted anonymously in future reports.

    The surveys also collected participants’:

    • APS classification
    • job family
    • managerial responsibilities.

    Optionally, participants could provide their gender identity to support cohort analysis.
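    The published material does not describe the linking mechanism in detail. As a minimal illustrative sketch only (not the evaluation team's actual method), one common way to keep responses anonymous while still linking them across survey waves is to derive a pseudonymous key from a stable identifier using a salted hash, so the same participant produces the same key in each survey without their identity being stored with their responses. The identifier, salt handling and names below are assumptions for illustration.

        import hashlib
        import os

        # Illustrative only: a secret salt known only to the evaluation team,
        # read from the environment so it is never stored with the responses.
        SALT = os.environ.get("SURVEY_LINK_SALT", "replace-with-a-secret-salt")

        def linking_key(identifier: str) -> str:
            """Return a pseudonymous key derived from a participant identifier."""
            normalised = identifier.strip().lower()
            return hashlib.sha256((SALT + normalised).encode("utf-8")).hexdigest()

        # The same participant yields the same key in the pre-use, pulse and
        # post-use surveys, so responses can be joined on the key alone.
        pre_use_key = linking_key("jane.citizen@example.gov.au")   # hypothetical address
        post_use_key = linking_key("Jane.Citizen@example.gov.au ")
        assert pre_use_key == post_use_key

    Any real implementation would also need to manage the salt securely and decide how long the keys are retained once the linked analysis is complete.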


  • Pre-use survey questions
    Question | Options
    How long have you been using Copilot? (in weeks)

    Number
    Have you used, or do you currently use, any other generative AI tools besides Copilot?

    Single choice:

    • Never 
    • Yes - in a personal capacity
    • Yes - have used another generative AI tool to assist with work tasks

    How frequently do you use the following Microsoft products?

    • Teams
    • Outlook
    • Word
    • PowerPoint
    • Excel
    • OneNote
    • Loop
    • Whiteboard

    Single choice:

    • Rarely/never/NA
    • Once a month
    • Once a week
    • Few times a week
    • Daily
    How many meetings do you attend in an average week?

    Single choice:

    • 0
    • 1-5
    • 6-10
    • 11-15
    • 16-20
    • 21-25
    • 26-30
    • 31+
    Please estimate how much of your average week you spend on tasks that you could perform more effectively with the help of automation. These could include creating a slide deck, summarising long documents, writing weekly reports, taking meeting minutes, analysing survey results or preparing proposals.

    Single matrix choice:

    • Less than 20%
    • 20 - 40%
    • 40 - 60%
    • 60 - 80%
    • 80 - 100%

    Please estimate how often you experience the following:

    • I struggle to find the information or documents I need to complete my job
    • I spend too much time reviewing and responding to emails
    • I am required to switch tasks at short notice
    • There are many meetings I attend to just ‘receive information’ (I do not need to contribute to the discussion)
    • It can be difficult to keep on top of my workload
    • I don’t have enough dedicated focus time
    • I feel rushed and don’t feel I have put forward my best work

    Single matrix choice:

    • Not at all
    • Rarely / A few times a year
    • A few times a month
    • A few times a week
    • A few times a day
    • Most of the day

    Please estimate how often your role requires the following:

    • Summarising large amounts of text-based information
    • Undertaking data analysis
    • Writing code in a programming language
    • Preparing briefs and proposals
    • Preparing speeches and talking points
    • Preparing visual communication products e.g. presentations, slide decks, infographics, process diagrams
    • Preparing public communication products for diverse audiences (e.g. website/social media content, fact sheets, publications)
    • Undertaking user research and consultation
    • Working collaboratively on documents
    • Other

    Single matrix choice:

    • Not at all
    • Rarely / A few times a year
    • A few times a month
    • A few times a week
    • A few times a day
    • Most of the day

    Please estimate how many hours you spend on the following tasks in an average week:

    • Searching for information required for a task
    • Summarising existing information for various purposes (email updates, talking points, briefs, papers, minutes, etc.)
    • Preparing meeting minutes
    • Preparing first draft of a document 
    • Undertaking preliminary data analysis
    • Preparing slides (for both presentations and information)
    • Communicating through digital means other than meetings (1:1 calls, Teams chats etc)
    • Other

    Single matrix choice:

    • 0
    • 1 - 4
    • 5 – 8
    • 9 – 12
    • 13 – 16
    • 17 – 20
    • 21 – 24
    • 25+
    Which of the following best describes your sentiment about using Copilot?

    Single choice:

    • Very pessimistic
    • Slightly pessimistic
    • Neutral
    • Slightly optimistic
    • Very optimistic

    To what extent do you agree or disagree with the following statements?

    I believe Copilot will…

    • Improve the speed at which I complete tasks
    • Improve the quality of my work
    • Make valuable suggestions to enhance my work
    • Help me spend less mental effort on tedious or mundane tasks
    • Allow me to attend fewer meetings
    • Allow me to spend less time in emails
    • Allow me to quickly find the information I am looking for
    • Allow me to reduce task switching
    • Free up more focus time for important work
    • Be a net positive on my work

    Single matrix choice:

    • Strongly disagree
    • Disagree
    • Somewhat disagree
    • Neutral
    • Somewhat agree
    • Agree
    • Strongly agree
    Are there any other impacts you believe Copilot will have on your work not mentioned above?

    Free text

    If you have not started using Copilot yet, what Copilot features are you most looking forward to using?

    If you are already using Copilot, what Copilot features are you finding most useful?

    Free text
    Are there any cases where you believe Copilot will not be suitable for your work?

    Free text

    Do you have any other concerns about using Copilot?

    Free text
  • Mid-trial survey questions
    Question | Options
    In your use of Copilot, how much do you trust the results that are returned?

    Single choice:

    • 0
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    Do you manage staff who have access to Copilot?

    True/false
    Do staff consistently flag they have used Copilot to prepare their outputs?

    Single choice:

    • Never
    • Sometimes
    • About half the time
    • Most of the time
    • Always
    Are you confident you could recognise the difference between outputs produced with Copilot and those produced without?

    Single choice:

    • Definitely not
    • Probably not
    • Might or might not
    • Probably yes
    • Definitely yes
    As a manager of staff using Copilot, how much do you trust outputs you receive from staff prepared with the assistance of Copilot?

    Single choice:

    • 0
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    Thinking overall about your interactions with Copilot, how little or how much do you agree with the following statement: ‘Copilot has met my expectations.’

    Single choice:

    • Disagree
    • Somewhat disagree
    • Neutral
    • Somewhat agree
    • Agree

    Thinking about your interactions with Copilot, how little or how much do you agree with the following statement?

    Copilot features have met my expectations in…

    • Excel
    • Outlook
    • PowerPoint
    • Teams
    • Word

    Single matrix choice:

    • Disagree
    • Somewhat disagree
    • Neutral
    • Somewhat agree
    • Agree
    How little or how much do you agree with the following statement: ‘I feel confident in my skills and abilities to use Copilot.’

    Single choice:

    • Not at all confident
    • Not very confident
    • Moderate
    • Fairly confident
    • Very confident
  • Post-use survey questions
    Question | Options
    Which of the following best describes your sentiment about using Copilot after having used it?

    Single choice:

    • Very pessimistic
    • Slightly pessimistic
    • Neutral
    • Slightly optimistic
    • Very optimistic
    To what extent do you agree with the following statement: ‘I want to continue to use Copilot after the trial’.

    Single choice:

    • Strongly disagree
    • Disagree
    • Neutral
    • Agree
    • Strongly agree
    Please explain why you would or would not continue to use Copilot.

    Free text
    How frequently did you use Copilot during the trial?

    Single choice:

    • Not at all
    • A few times a month
    • A few times a week
    • A few times a day
    • Most of the day
    During the Trial, which Microsoft products have you used with Copilot features? Select all that apply.

    Multi choice:

    • Copilot chat
    • Teams
    • Outlook
    • Word
    • PowerPoint
    • Excel
    • OneNote
    • Loop
    • Whiteboard

    (Conditional display)
    How frequently did you use the following Copilot chat features?

    • General queries
    • Interface with other Microsoft products
    • Document search / retrieval

    Single matrix choice:

    • Not at all
    • A few times a month
    • A few times a week
    • A few times a day
    • Frequently throughout the day

    (Conditional display)
    How frequently did you use the following Teams features?

    • Meeting summaries
    • Real-time answers
    • Task management

    Single matrix choice:

    • Not at all
    • A few times a month
    • A few times a week
    • A few times a day
    • Frequently throughout the day

    (Conditional display)
    How frequently did you use the following Outlook features?

    • Email summarisation
    • Drafting assistance
    • Meeting follow-up

    Single matrix choice:

    • Not at all
    • A few times a month
    • A few times a week
    • A few times a day
    • Frequently throughout the day

    (Conditional display)
    How frequently did you use the following Word features?

    • Summarisation
    • Rewrite suggestions
    • Tone adjustments
    • Formatting assistance

    Single matrix choice:

    • Not at all
    • A few times a month
    • A few times a week
    • A few times a day
    • Frequently throughout the day

    (Conditional display)
    How frequently did you use the following PowerPoint features?

    • Content creation
    • Design suggestions
    • Summarisation

    Single matrix choice:

    • Not at all
    • A few times a month
    • A few times a week
    • A few times a day
    • Frequently throughout the day

    (Conditional display)
    How frequently did you use the following Excel features?

    • Data analysis
    • Formula assistance
    • Insight generation

    Single matrix choice:

    • Not at all
    • A few times a month
    • A few times a week
    • A few times a day
    • Frequently throughout the day

    (Conditional display)
    How frequently did you use the following OneNote features?

    • Content generation
    • Summarisation
    • Task management
    • Collaboration enhancement

    Single matrix choice:

    • Not at all
    • A few times a month
    • A few times a week
    • A few times a day
    • Frequently throughout the day

    (Conditional display)
    How frequently did you use the following Loop features?

    • Draft page content
    • Content editing
    • Summarisation
    • Idea generation

    Single matrix choice:

    • Not at all
    • A few times a month
    • A few times a week
    • A few times a day
    • Frequently throughout the day

    (Conditional display)
    How frequently did you use the following Whiteboard features?

    • Idea generation
    • Content summarisation
    • Content organisation
    • Interactive collaboration

    Single matrix choice:

    • Not at all
    • A few times a month
    • A few times a week
    • A few times a day
    • Frequently throughout the day
    Which Copilot feature(s) had the most positive impact during the Trial? Please describe any particular use case(s) of how you used the feature(s) in your job.

    Free text
    What other generative AI products do you use in a work capacity to support your role? Select all that apply.

    Multi choice:

    • I do not use other generative AI products in a work capacity
    • ChatGPT
    • Gemini
    • Claude
    • Meta AI
    • Bing AI
    • Perplexity
    • Midjourney
    • GitHub Copilot
    • Other
    How does Copilot compare to other generative AI products you have used?

    Single choice:

    • Other generative AI products meet my needs significantly more than Copilot
    • Other generative AI products meet my needs slightly more than Copilot
    • Copilot and other generative AI products meet my needs to the same extent
    • Copilot meets my needs slightly more than other generative AI products
    • Copilot meets my needs significantly more than other generative AI products
    How little or how much do you agree with the following statement: ‘I feel confident in my skills and abilities to use Copilot.’

    Single choice:

    • Not at all confident
    • Not very confident
    • Moderately confident
    • Fairly confident
    • Very confident
    How did you learn to use Copilot? Select all that apply.

    Multi choice:

    • My agency provided Copilot training
    • Microsoft provided Copilot training
    • I found Copilot resources on the Internet
    • I experimented with Copilot to learn its functionalities
    • Other
    To what extent do you agree with the following statement: ‘The Copilot training I received was useful.’

    Single choice:

    • Strongly disagree
    • Disagree
    • Neutral
    • Agree
    • Strongly agree

    To what extent do you agree or disagree with the following statements?

    Using Copilot has…

    • Improved the speed at which I complete tasks
    • Improved the quality of my work
    • Made valuable suggestions to enhance my work
    • Helped me spend less mental effort on tedious or mundane tasks
    • Allowed me to attend fewer meetings
    • Allowed me to spend less time in emails
    • Allowed me to quickly find the information I am looking for
    • Allowed me to reduce task switching
    • Freed up more focus time for important work
    • Been a net positive on my work

    Single matrix choice:

    • Strongly disagree
    • Disagree
    • Somewhat disagree
    • Neutral
    • Somewhat agree
    • Agree
    • Strongly agree

    On average, how many hours per day has Copilot helped you save in the following areas?

    • Searching for information required for a task
    • Summarising existing information for various purposes (email updates, talking points, briefs, papers, minutes, etc.)
    • Preparing meeting minutes
    • Preparing first draft of a document
    • Undertaking preliminary data analysis
    • Preparing slides (for both presentations and information)
    • Communicating through digital means other than meetings (1:1 calls, Teams chats etc)
    • Attending meetings
    • Writing or reviewing code in a programming language
    • Other

    Single matrix choice:

    • Copilot has added time to this activity
    • 0
    • 0.5 – 1
    • 1 – 2
    • 2 – 3
    • 3 – 4
    • 4+
    • N/A (My job does not require me to complete this task)
    To what extent do you agree with the following statement: ‘Copilot has enabled me to allocate my time to perform tasks that are higher value and/or more complex.’

    Single choice:

    • Strongly disagree
    • Disagree
    • Neutral
    • Agree
    • Strongly agree

    (if agree or strongly agree)

    Please describe the tasks that are higher value and/or more complex that Copilot has enabled you to complete.

    Free text

    To what extent do you agree or disagree with the following statements?

    Copilot has improved the quality of my…

    • Searches for information required for a task
    • Summaries of existing information for various purposes (email updates, talking points, briefs, papers, minutes, etc.)
    • Preparation of meeting minutes
    • Preparations of the first draft of a document
    • Preliminary data analysis
    • Preparation of slides (for both presentations and information)
    • Communication through digital means other than meetings (1:1 calls, Teams chats etc) 
    • Coding
    • Other

    Single matrix choice:

    • Strongly disagree
    • Disagree
    • Neutral
    • Agree
    • Strongly agree
    • N/A (My job does not require me to complete this task)
    If you selected 'Other', please describe the output(s) delivered and Copilot's impact on quality.

    Free text

    Please describe how Copilot has affected the quality of your outputs.

    Free text

    (if managing staff)

    What is the impact of Copilot on the quality of your team's outputs?

    Single choice:

    • Negative
    • Somewhat negative
    • Neutral
    • Somewhat positive
    • Positive

    (if managing staff)

    What is the impact of Copilot on the efficiency of your staff?

    Single choice:

    • Negative
    • Somewhat negative
    • Neutral
    • Somewhat positive
    • Positive
  • Copilot trial evaluation briefing

    About the briefing

    Following the release of the Microsoft 365 Copilot trial evaluation, the Digital Transformation Agency (DTA) hosted a public evaluation briefing. The briefing was held on Friday 25 October 2024.

  • Video transcript

    Good morning everyone and thank you for joining us today. My name is Lucy Poole and I'm the head of the division for Strategy, Planning and Performance here at the Digital Transformation Agency. I'd like to advise our attendees that this session will be recorded; however, as it's a webinar, only the presenters will appear in the recording. Before I begin I would like to acknowledge the traditional custodians of the various lands on which we are all meeting today, for me the Ngunnawal people, and I would like to pay my respects to any First Nations people who may be joining us today. Firstly, it's wonderful to see such interest in our briefing today, with numbers heading up towards 500, which is an excellent turnout for us. As you can see from our agenda we've got a lot to cover. In a moment I'll explain the context for the trial within our wider work on AI in government. I'll then pass over to Lauren Mills, who will give an overview of the evaluation approach, the findings and, most importantly, the recommendations, and consequently what's next for government. Before we go any further I do need to let you know about a probity matter.

    On the 4th of October the DTA released an RFT on AusTender for the Cloud Marketplace refresh. We are also working on the tender for the Digital Marketplace panel. To maintain fairness we can't speak to, or answer questions about, these processes today, even if they seem simple or would confirm public information. If you have direct questions about them please reach out to the respective contact officers. Those details are shown on your screen.

    I would also like to clarify from the outset that the Digital Transformation Agency is not responsible for how government departments or agencies choose to procure or adopt generative AI tools, including Copilot for Microsoft 365. Now, without further ado, let's set the scene for today. First of all, I would like to give thanks to everybody who is here and who provided questions ahead of today's session. Your questions have helped us to frame up the conversation that we'll have with you today, and with any luck we'll cover off the majority of those questions. Based on what you told us, we're going to establish where the trial sits in the larger context of AI in government, what the recommendations mean for APS agencies and vendors, and where we go from here. There are lots of specific questions that go deep into ways to use the product or very specific technical and security implications. We could talk for hours on these aspects, but sadly we don't have the time, so this morning's session will include some resources that go some way to explaining the wider picture of AI in government. It will also cover off the technical and security information about Microsoft 365 Copilot. For those who haven't read it, I highly encourage reading the full evaluation report, as it answers almost all the questions that won't be directly addressed today. We've also enabled the Q&A functions in Teams and our teams will try to answer as many of your questions as we can. Please do ask follow-ups throughout the session today. We'll capture everything you ask and look to publish more information after the session to fill any gaps we don't cover today. So let's move into the wider context for AI in government. For those of you who aren't familiar with the work of the Digital Transformation Agency, we are the government's advisor for the development, delivery and monitoring of whole-of-government strategies, policies and standards for digital and ICT investments and procurement. This includes setting the overall direction for how the Australian Government explores, procures and adopts new technology, including generative AI. We aren't responsible for what we call whole-of-economy policy work on AI, the work that impacts every business or person across Australia, but we do ensure that our work aligns to this broader picture, because government should naturally be the exemplar of safe, responsible use of AI that fulfils Australia's AI ethics principles. We work very closely with the Department of Industry, Science and Resources, who are responsible for this whole-of-economy piece. Together we co-led the AI and government taskforce through to the end of June this year. The work of that taskforce has directly informed the work of the DTA, including the whole-of-government policy for the responsible use of AI and the accompanying standards and guidance to help agencies fulfil the policy's requirements.

    This includes training and supporting guidance for APS staff, the end users of generative AI. These instruments are available and are relevant and applicable right now. We also have an ongoing slate of work which will see results through the new year. These include developing AI technical standards for use by government, which will be openly available for use by other governments, organisations and industry. We're also working on piloting the Australian Government's own AI assurance framework for positively managing the risks associated with different AI use cases, and continuing our work on progressively updating the AI policy to keep up with both changes in technology and the expectations of the APS and the wider community.

    It's within this context that we undertook the trial of Microsoft 365 Copilot as an example of a generative AI tool, so that we could understand what impact this technology might begin to have on the way that public servants go about their work. That was the intent of the trial and its evaluation: its impact on how people work. At the time of the trial, Microsoft 365 Copilot was the most appropriate tool to undertake an evaluation of general generative AI capabilities in the day-to-day technology suites of the APS. Its integration with Office products familiar to our APS staff, within existing whole-of-government contracting arrangements, allowed us to undertake the evaluation in a timeline that worked for our needs and in a way that would ensure a relatively consistent implementation and user experience across government. Now I'm starting to get into the weeds of the trial itself and how the evaluation worked, so at this point I'll hand over to Lauren Mills. Thank you.

    Thanks Lucy, and good morning everyone. I'm Lauren Mills and I lead the Strategy and Prioritisation Branch here at the DTA. I'm going to start with a brief explanation of our evaluation approach, noting the full report is available and goes into a lot more detail for those who are interested. The team are going to pop a link in the chat, so if you haven't had the opportunity to read it, or you'd just like to follow along today, you can grab that now. As Lucy mentioned, the tool we selected for the trial was Microsoft 365 Copilot, which is ubiquitous throughout the M365 suite of products, and given this we thought it wasn't the best approach to set a strict set of use cases for agencies. For the avoidance of all doubt, we specifically looked at Microsoft 365 Copilot and not other Copilot offerings. Defining use cases would have run counter to the experimental nature of the trial, and we wanted to give agencies, with their specific operating environments and their own requirements, the flexibility to effectively choose their own adventure and see what they could find with the product. This was really important because we had over 7,700 licences purchased across 60 Australian Government entities participating in the trial, but we wanted to start up front with what we needed to find out through the course of the trial. So we worked closely with the AI and government taskforce and identified four key outcome areas that we wanted to explore. The first was employee-related outcomes: staff sentiment in the use of Copilot as an example of generative AI, including staff satisfaction, opportunities for innovation, confidence in the use of Copilot and how easily it could integrate into existing workflows. Of course productivity was a key area we wanted to explore through the trial, both in terms of efficiency and quality, and whether there was opportunity for process improvements. We also wanted to look at the adoption of AI more broadly: to what extent Copilot could be implemented in a safe and responsible way across government, how it could pose benefits and challenges in the short and longer term, and what barriers to innovation we have that might require changing the way we deliver our services to embrace the opportunities of these new technologies. Finally, we wanted to understand any unintended consequences, both benefits and challenges, of implementing Copilot and the implications for broader adoption across the APS. The report that has been published includes the post-trial evaluation findings, but I also wanted to share some of the insights we learned throughout the trial. As I mentioned, one of the biggest challenges we had was the sheer breadth of agencies who had signed on to participate. Every agency had a different level of maturity in their use of AI, different operating environments and of course different risk appetites. However, although these agencies were all very different, there were a lot of common themes that came through, and I'm going to talk through some of those now. Initially, security was a key focus area, and agencies had various levels of reliance on the Australian Cyber Security Centre's Infosec Registered Assessors Program, or IRAP, assessments.

    While IRAP assists agencies in their security assessments, it is of course not mandatory across the APS. However, many agencies, particularly the smaller ones, have mandated it as part of establishing their authority to operate, so it was absolutely critical that we got the IRAP assessment sorted straight away. Another key area was data governance. While not a new risk for us, the nature of the tool, and the ease with which Copilot could surface all the documents and files that individual staff members had access to, was something we needed to understand from the start. Some agencies took advantage of it and actually used Copilot to undertake audits of their information systems before fully rolling out the product. Others were satisfied that they had the right risk management processes in place to identify, and then remediate, anything found through the course of the trial. The other key area was privacy. Similarly, and in some cases directly related to data governance, there were various levels of risk appetite in relation to privacy, and agencies who held customer data applied a higher level of caution, which makes sense. However, one of the biggest challenges was understanding the privacy considerations specific to generative AI: what made it different to other technologies? We had established a program board to govern the trial, and underneath this program board we established a privacy working group to unpack some of these privacy considerations in more detail. As part of this group we set up a cross-APS privacy impact assessment, coordinated by the Department of Home Affairs, to establish a base set of common assumptions and use cases. While agencies are obviously responsible for conducting their own assessments based on their own operating environments, they could use this joint PIA to reduce duplication of effort as well as costs, and it was a really great example of how the APS can collaborate to efficiently solve problems and share our learnings. Finally, there was recordkeeping. Through the central issues register that we established for the trial, there was a common theme around what constitutes a record under the Archives Act, specifically for things like meeting recordings, transcripts and, of course, first drafts of documents produced using Copilot. To unpack this further we established a second working group under the trial's program board, which worked in consultation with the National Archives of Australia to establish whole-of-government advice on how these records should be treated. This work remains ongoing, but at a high level Copilot and other generative AI assistants could be viewed as simply another tool that staff may use to conduct their work, and records should have retention periods that reflect that. This is not a catch-all, and there will be additional nuanced advice depending on the scenario or use case, so as I said this work is continuing and we're hoping to get some whole-of-government advice out.
    In terms of what we saw around the success of the trial, the agencies that had the most success in the adoption and use of this product were those that had already thought about their specific environments and the application of generative AI within their organisation, which makes sense. The other key finding, which is quite common in change projects, is that those who had champions, particularly strong executive sponsors promoting the benefits, saw the highest adoption and were able to conduct robust internal evaluations.

    In terms of the post-trial evaluation findings, as I've mentioned the report is very detailed and will provide a really good source of information for all of you, so I'm only going to cover some of the findings at a high level today. As I mentioned earlier, there were four key areas we looked at in the evaluation. In terms of employee-related outcomes, we saw that most trial participants were positive about Copilot and wished to continue using it: 86% of trial participants said they wanted to keep using the product. Interestingly, Senior Executive Service staff (about 93%) and corporate roles (about 81%) had the highest positive sentiment towards Copilot. However, despite the positive sentiment, use of Copilot was moderate. Our analysis was conducted across both job families and the different classification levels across the APS, and moderate usage was consistent across these classifications and job families, but the specific use cases varied. For example, a higher proportion of SES and EL2 staff used the meeting summarisation features compared to other APS classifications, which makes sense. Microsoft Teams and Word were the most frequently used and met participants' needs; however, Excel functionality was considered very poor, and access issues in Outlook did hamper use of that product. As expected, content summarisation and rewriting were the most used Copilot functions, but it was clear that other generative AI tools might be more effective at meeting users' needs for things like writing code, generating images or searching research databases.

    It's clear that tailored training and propagation of high-value use cases could improve adoption. We saw that training significantly enhanced confidence in the use of Copilot and was most effective when it was tailored to an agency's specific context. It's also important that we identify specific use cases for Copilot, which will, as I said, help that adoption and promote the use of the product. On productivity, most trial participants believed Copilot improved the speed and quality of their work. Improvements in both efficiency and quality were seen, with perceived time savings of around an hour a day for some people for tasks such as summarisation, preparing the first draft of a document, or information searches. It was really great to see that 40% of survey respondents reported reallocating their time to important activities such as mentoring and culture building, strategic planning, engaging with stakeholders and product enhancement. However, Copilot's inaccuracy did reduce the scale of these productivity benefits. The gains in quality were more subdued relative to the efficiency gains, and the potential unpredictability and lack of contextual knowledge in Copilot's outputs required time to be spent on output verification, which negated some of the efficiency savings. In terms of adoption, there is, as I've said, a need for agencies to engage in planning activities about how they bring on board generative AI tools, and to make sure that their governance structures and processes appropriately reflect their risk appetites. Many of the insights under this outcome reflect what we found during the trial, but some of the key barriers were the integration challenges with non-Microsoft 365 applications; however, it should be noted that these integrations were out of scope for the trial. Prompt engineering, identifying relevant use cases and understanding the information requirements of Copilot across the Microsoft Office products were significant capability barriers, and of course planning that reflects the rolling-release nature of gen AI tools, alongside relevant governance structures, is important. And finally, there were some unintended outcomes, both benefits and concerns, that will need to be actively monitored throughout the adoption of AI.

    In terms of benefits, a really interesting outcome was that gen AI could improve inclusivity and accessibility in the workplace, particularly for those who are neurodiverse, have a disability or are from a culturally and linguistically diverse background, and that the adoption of Copilot and gen AI more broadly could actually help the APS attract and retain employees. However, there were some concerns, particularly around the potential impact of gen AI on APS jobs and skills needs in the future. Also, as we've seen more broadly, the outputs might be biased towards Western norms and may not appropriately use cultural data and information, such as misusing First Nations images and misspelling First Nations words. There were concerns that the use of gen AI might lead to a loss of skill in summarisation and writing, or conversely that a lack of adoption of gen AI may result in a false assumption that people who use it are more productive than those who don't. Participants also expressed concerns relating to vendor lock-in; however, the report found the realised benefits were limited to specific features and use cases, as we've discussed. Finally, participants were also concerned about the APS's increased impact on the environment resulting from gen AI use. In terms of the recommendations, the overarching findings reveal several considerations for the APS in the context of future adoption of gen AI, and we put together eight recommendations in total across three focus areas. First, we need to ensure we do detailed and adaptive implementation. In terms of product selection, agencies should consider which gen AI solution is most appropriate for their overall operating environment and their specific use cases, particularly for these AI assistant tools. In system configuration, we must configure our information systems, permissions and processes to safely accommodate gen AI products. Specialised training is essential, reflecting agency-specific use cases and developing broader gen AI capabilities, including prompt training. As I discussed, change management is key: effective change management should support the integration of gen AI, potentially identifying gen AI champions to highlight the benefits and encourage adoption. We need to develop clear guidance on using gen AI, including when consent and disclaimers are needed, such as in meeting recordings, and a clear articulation of accountabilities. We need to encourage greater adoption by analysing our workflows across various job families and classifications to identify further use cases that can improve adoption. We need to continue to share use cases: we've seen great collaboration and bringing together of knowledge across the APS around the use of this emerging technology, and we need to continue that and look at where we can share through appropriate whole-of-government forums to facilitate the adoption of gen AI. And finally, we've talked a lot about some of these impacts, and we need to proactively monitor the impacts of generative AI, including its effects on the workforce, to manage current and emerging risks effectively. So what's next? Many of the questions we've received from you were future-focused, which is great to see. The findings from the trial will directly inform the next iteration of the policy for the responsible use of AI in government, as well as the AI assurance framework which, as Lucy mentioned, we are piloting right now.
    In addition to this, we'll continue to explore the work on privacy and recordkeeping under those working groups, which will remain in place after the conclusion of the trial, and we're working closely with the National Archives of Australia as well as the Office of the Australian Information Commissioner to progress that work. In terms of gen AI adoption across government, the decision to adopt generative AI remains the responsibility of each agency, as Lucy made clear up front. We do play a role here at the DTA in supporting that decision-making through our policies and frameworks and access to vendors through the Digital Marketplace. Specific to Copilot, we do understand there may be an uptake across the service, so we are currently finalising some technical readiness documentation to support agencies who do choose to implement Copilot. This suite of documents aims to support the safe and responsible implementation of the product within the Australian Government context, and we're working closely with the Australian Cyber Security Centre to complete that work. So that brings us to the end of that section, moving into the Q&A part of today's agenda. As noted at the start of the briefing, we have grouped our questions into key themes in order to cover as much as possible, so we're going to start with the questions that were asked through the registration process and, where possible, some that were raised today. My colleagues, I believe, are also responding to questions through the Q&A function. Thank you so much, Lauren. We have had a fairly significant disruption to our MS Teams service at our end, which means we're not able to access the questions coming through, nor even make some more manual workarounds, so we're going to call it a day here. As I mentioned earlier, the presentation today will be shared online, so you'll have full access to that, and there will be more information that the Digital Transformation Agency looks to push out based on the questions that we know have come through but may not have been addressed throughout the session. I do appreciate the thumbs up and the clapping that's coming through. Whenever you're doing a live session you've got to expect these things, I guess.

    So I'd like to give thanks both to the DTA team in the background, who are frantically trying to fix the problem, and to those who were enrolled directly in the pilot itself. Thank you to Lauren Mills and, most importantly, thank you all for attending today. Your interest is greatly appreciated and we hope that the information we covered has been helpful for you. So thank you, enjoy the rest of your Friday and have a fabulous weekend.

     

  • Participants were given the opportunity to ask questions before, during and after the briefing. Below are answers to the most frequent questions and, where possible, additional information to support industry and Australian Public Service (APS) staff.

    Questions and answers

    What questions were asked in the evaluation surveys?

    You can access the questions for all 3 surveys on the Copilot trial survey page.

    Is government aware of bespoke and standalone generative AI products?

    The trial used Microsoft 365 Copilot to evaluate employee outcomes and productivity-related outcomes of general-use generative AI in the APS.

    Agencies may choose to explore bespoke, standalone or other use cases and may seek information on or procure solutions through the marketplaces on BuyICT.

    What should vendors consider if they wish to offer generative AI solutions or services to government?

    Vendors should make sure their AI offerings align to applicable policies, including:

    As with any technology, vendors should be familiar with:

    Are there plans for future trials of generative AI products from other vendors?

    As of January 2025, the Digital Transformation Agency has no plans to conduct further whole-of-government trials of generative AI products. APS agencies may conduct their own trials or evaluations.

    Will Copilot be offered to APS agencies as part of Microsoft's whole-of-government arrangements?

    APS agencies may choose to procure Microsoft 365 Copilot within the whole-of-government single-seller arrangement. 

    Will the Australian Government train its own generative AI model?

    As of January 2025, the Australian Government is not exploring a bespoke, whole-of-government generative AI model. Agencies may choose to procure, develop or collaborate on bespoke models to meet their specific needs.

    How were privacy or security concerns managed during the trial?

    As with any technology, agencies must apply relevant policies when using generative AI technologies.

    Before the trial, Microsoft commissioned an updated Infosec Registered Assessors Program (IRAP) assessment for its products that integrate with and enable Copilot features.

    This is available to tenant administrators on the Microsoft Service Trust portal. Agencies were also required to conduct a privacy impact assessment before deploying Microsoft 365 Copilot to their participating staff.

    The evaluation noted that, during the trial, agencies faced:

    The DTA publishes and maintains AI-specific resources to support agencies through these challenges. This includes the Australian Government’s pilot AI assurance framework and a suite of AI technical standards. They are due for release in 2025.

    Is there guidance available for APS agencies which develop, procure or deploy generative AI tools?

    APS agencies which develop, procure, deploy or use AI must comply with whole-of-government policies, standards and guidance.

    As with any technology, agencies must also align to other applicable policies, such as those related to:

    • procurement
    • cybersecurity
    • privacy
    • data protection and management
    • Indigenous data governance
    • transparency.

    Did the benefits of Copilot to agencies outweigh the costs?

    The trial evaluation did not assess the cost-benefit ratio for Microsoft 365 Copilot at a whole-of-government level. However, the overarching findings note that agencies should consider the costs of implementing Copilot and other generative AI products while they are in their early days.

    Agencies may choose to conduct cost-benefit evaluations specific to their operating environment, whether drawing upon their agency-level observations from the whole-of-government trial or while independently piloting generative AI products.

    Did the trial evaluate for accessibility benefits?

    While the trial did not directly evaluate accessibility benefits, some positive outcomes for inclusivity and accessibility were detailed in the full report.

    Did the trial evaluate environmental impacts?

    While the trial did not directly evaluate environmental impacts, the report detailed some concerns that were observed around the use of generative AI and the APS’s environmental footprint.

    Did the trial benchmark or evaluate the accuracy of Copilot outputs?

    The trial did not benchmark or technically evaluate the accuracy of Microsoft 365 Copilot’s outputs.

    That said, participants reported that inaccuracy and unpredictability impacted their productivity. This could have implications for broader adoption of generative AI.

    The full evaluation methodology can be explored in Appendix B.

    Did the trial compare participants in technical and non-technical jobs?

    Differences in experience between APS classifications and job families can be explored across the employee-related outcomes and productivity chapters of the full report.

    Information about how the job families were aggregated, and about limitations including positive sentiment bias, can be found in Appendix B.

    The rate of survey participation by job family can be found in Appendix D.

    What impact will generative AI have on the APS workforce?

    The full report makes several observations related to the impact of generative AI tools such as Microsoft 365 Copilot on workforces.

    Many of these are detailed in the unintended outcomes chapter.

    They include potential:

    • improvements to inclusivity and accessibility
    • staff attraction and retention
    • impacts on roles and employment opportunities
    • skills development and decay.

    The evaluation recommends proactive monitoring for current and emerging risks, including the effects on the workforce.

    Is further training required for APS staff to effectively adopt generative AI?

    The evaluation observed a positive relationship between training and capability. Participants found training to be more effective when tailored to an APS context.

    Recommendation 3 suggests agencies should offer specialised training based on their specific use cases.

    Meanwhile, whole-of-government policy strongly recommends minimum training for all staff as well as additional, role-based training. To help agencies fulfil this recommendation, the DTA has published an AI fundamentals training module.

  • Insights and seminars

    From time to time, the Digital Transformation Agency publishes insights and seminars to help government agencies adopt AI technologies.

    Implementing AI

    During a session titled Implementing AI, Kate Pounder (Board member, Amplify and former CEO, Technology Council of Australia), Doug Gray (Director of Data Science, Walmart Global Tech) and Dr Evan Shellshear (Adjunct Professor, QUT) discuss what makes innovative data science projects successful, with real-world examples from one of the world's leading tech innovation labs.

    With government's unique accountability, financial and cultural constraints in mind, the discussion helps APS leaders and practitioners find a shared vocabulary and opportunities to innovate with AI technologies effectively.

    This discussion was recorded on 16 October 2024.

  • Video transcript

    A video transcript will soon be made available.

