Authors: Shankar Sridharan, Catherine Peters, Luke Readman, Bruno Botelho, Andrew Taylor, Neil Sebire, and Robert Robinson.
The NHS currently takes an erratic approach to assessing whether the software technologies it adopts (including clinical decision support systems, automation tools, and algorithmic or generative AI systems) work and deliver proven benefits. Previous strategies have focused on limiting technology choices and enforcing standardised solutions, leading to poor outcomes and wasted resources.
We propose a different approach: rather than attempting to standardise which technologies are used, it provides a comprehensive framework for selecting the best technology.
The challenge of technology adoption in the NHS
The NHS faces mounting pressures, exacerbated by the pandemic, leading to longer waiting times and delayed treatments. The UK Government has proposed increasing NHS appointments by 40,000 per week to alleviate these pressures [1]. A core strategy for achieving this involves leveraging technology, including artificial intelligence (AI) and digital innovation, to enhance patient care, clinician efficiency, and operational sustainability [2]. However, the NHS lacks a standardised framework to evaluate whether a technology is clinically effective, safe, and suitable for integration into NHS workflows, leading to significant risks [3]. The concept of ‘safety’ itself is broad, encompassing clinical risk, data security, and cybersecurity. This can lead to inconsistencies in the criteria used for technology selection and adoption. This has resulted in wasted investments, failure to achieve expected transformations, and, in some cases, unforeseen patient safety risks [4].
An NHS technology evaluation framework
To address the growing challenges of evaluating digital health solutions, NHS TEST (Technology Evaluation Safety Test) was developed at the request of NHS England (London Region). TEST is a practical technology evaluation framework designed to support the safe and effective scaling of any type of technology across the NHS.
The framework ensures that technologies meet rigorous assurance standards and demonstrate clear evidence of benefit to both patients and healthcare professionals, in line with NHS England guidelines [5]. By standardising evaluation criteria, NHS TEST enables the selection of digital solutions that are not only safe and effective but also scalable across different care settings. TEST complements existing national guidance, such as the NICE manual for evaluating health technologies (including digital tools), by offering a focused, operationally useful alternative to lengthy and complex policy documents. It provides healthcare organisations with a simple and structured approach to technology assessment, streamlining decisions around implementation and adoption.
The framework employs a dual evaluation model, focusing on both platform assurance (foundational technical and regulatory compliance) and demonstrable benefits to support healthcare delivery.
The dual evaluation model can be adapted or expanded to assure a specific technology type seeking national scale and clinical integration. By embedding NHS TEST into national procurement and digital strategies, the NHS can accelerate responsible innovation, optimise resource allocation, and strengthen public trust.
In this instance, TEST has been specifically applied to Ambient Voice Technology (AVT), a rapidly emerging digital tool aimed at reducing documentation burden and improving clinical efficiency. TEST may also support solutions beyond AVT, as the framework can be used as it stands or iterated and adapted to meet new requirements.
TEST is not guidance, but a practical tool to assess whether a technology is sufficiently assured and demonstrates proven benefit to justify scaling. As such, technologies being considered for wider adoption must be held to a higher standard than those in local use or early development.
Moral hazard leads to poor choices
Technology adoption in healthcare is often influenced by ‘moral hazard’, where decision-makers or technology providers, who may not bear responsibility for the outcomes, promote digital solutions without a comprehensive understanding or sufficient evidence of value [6]. Vendors may push their products into the NHS using persuasive marketing or small-scale pilots rather than robust clinical validation. Similarly, procurement decisions may prioritise cost savings or vendor relationships over long-term patient safety and effectiveness. Additionally, ‘positivity advocacy bias’, where enthusiastic clinicians and administrators support a technology without full visibility of associated risks, can lead to premature adoption of digital tools, exposing the NHS to inefficiencies and financial loss. Without rigorous oversight, ineffective or unsafe technologies can embed themselves into clinical practice before their true value is assessed [7].
Pilots without progress
Historically, NHS technology evaluation has been fragmented, with site-specific assessments leading to inconsistent adoption, variable clinical outcomes, and underutilised resources [8]. Many NHS hospitals run pilot studies not as a structured approach to evaluating new technology, but as a reaction to fear of missing out (FOMO). When one NHS Trust trials a particular innovation, there can be immense pressure for other Trusts to follow suit, whether or not the technology has been proven to deliver real benefits. Rather than being rigorous, data-driven assessments, these pilots are often underpowered, lack clear evaluation criteria, and have no strategic plan for scaling up if successful. As a result, the NHS becomes trapped in a cycle of ‘perpetual piloting’, where new technologies are trialled repeatedly but never properly assessed, deployed or implemented at scale. This approach hinders progress and allows ineffective solutions to linger, diverting time, resources, and staff effort away from meaningful improvements. Without an effective, outcome-focused framework for selecting and adopting technology, the NHS risks investing in innovation that looks promising on the surface but fails to deliver real impact, reinforcing a culture where pilots are run for optics rather than genuine transformation [9].
Ambient Voice Technology – a case for why we need TEST now
While some safety assessments exist, they vary considerably, leaving gaps in data security, cybersecurity, and clinical safety [10]. Evaluating the benefits of AI-driven healthcare technologies is particularly complex due to their evolving nature.
Ambient Voice Technology (AVT) with Generative AI is a relatively new technology. AVT comprises an AI agent that listens to a conversation between a clinician and a patient and creates 95% of the clinical documentation [11]; the clinician is always required to check the output. The rapid introduction of AI technologies such as AVT, combined with the speed at which these technologies are evolving, creates potential safety risks and a lack of clarity as to how to select and implement these technologies for NHS use [12]. Many clinicians, as well as some GP practices, have been using technologies such as AVT without checking that these solutions have proven benefit in the NHS or are fully compliant with information governance (IG) and cybersecurity requirements [13].
Assessing what counts
TEST prioritises clinical effectiveness, cost-effectiveness, and workforce impact. A core principle of NHS TEST is to ensure that technologies meaningfully improve clinical care. Without robust evidence of NHS effectiveness, new systems may be introduced without delivering tangible patient benefits. Technologies must undergo rigorous evaluation through randomised controlled trials (RCTs), large-scale studies, or extensive NHS pilots. Studies should be sufficiently powered to show evidence of real-world benefit in the NHS to qualify for Gold Certification (meets criteria for wider NHS implementation). This level of scrutiny ensures that technologies selected for scaling across the NHS have a proven impact, reducing investment waste and patient safety concerns.
Section A is binary, confirming that platforms meet set assurance criteria. Section B, the benefits scoring approach, will evolve iteratively and is currently being validated. Because the tool is vendor agnostic, the score produced also supports the ranking of different suppliers.
Technologies that demonstrate substantial benefits in cost-effectiveness and workforce impact—even without full clinical trials—can still receive Silver Certification (meets criteria for use in limited locations or workflows). If a tool reduces costs, optimises resources, or alleviates clinician burden, it may be considered for NHS use, provided it meets at least 50% of the operational efficiency and workforce impact criteria.
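The certification logic described above can be sketched as a simple decision procedure. This is an illustrative assumption drawn from the text, not the published NHS TEST scoring rules: the function name, inputs, and the way the 50% threshold and Gold evidence gate are combined are hypothetical.

```python
# Illustrative sketch of the TEST dual evaluation model described above.
# The structure and thresholds are assumptions based on the text, not the
# official NHS TEST scoring rules.

def certify(platform_assured: bool,
            has_powered_nhs_evidence: bool,
            efficiency_criteria_met: int,
            efficiency_criteria_total: int,
            workforce_criteria_met: int,
            workforce_criteria_total: int) -> str:
    """Return a certification tier for a candidate technology."""
    # Section A is binary: a platform that fails assurance is not certified.
    if not platform_assured:
        return "Not certified"

    # Gold: sufficiently powered evidence of real-world benefit in the NHS
    # (e.g. RCTs, large-scale studies, or extensive NHS pilots).
    if has_powered_nhs_evidence:
        return "Gold"

    # Silver: at least 50% of the operational efficiency and workforce
    # impact criteria met, even without full clinical trials.
    efficiency_ok = efficiency_criteria_met / efficiency_criteria_total >= 0.5
    workforce_ok = workforce_criteria_met / workforce_criteria_total >= 0.5
    if efficiency_ok and workforce_ok:
        return "Silver"

    return "Not certified"
```

Under these assumptions, an assured tool with strong pilot data but no powered NHS evidence, meeting 6 of 10 efficiency criteria and 4 of 6 workforce criteria, would receive Silver Certification.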
The NHS TEST framework is adaptable for a range of innovations in healthcare and ensures consistency in evaluation processes across different digital technologies while allowing for necessary domain-specific adaptations.
Finding balance
The NHS must ensure that the technology it adopts is safe, effective, and sustainable before it is widely deployed. A common concern with evaluation frameworks is that excessive regulation may hinder innovation. NHS TEST will accelerate adoption by providing clear benchmarks for safety, effectiveness, and interoperability. This structured roadmap offers suppliers a predictable path to NHS approval, eliminating uncertainties in approval and procurement decisions. By promoting a more efficient development pipeline, TEST supports both start-ups and established technology vendors in designing NHS-ready solutions.
As technology evolves, NHS TEST will continue to adapt and undergo iterative development. Regular updates will ensure alignment with digital transformation initiatives and emerging clinical needs. Longitudinal studies should assess the long-term impact of NHS TEST-approved technologies on care quality and cost-effectiveness.
TEST is not designed to restrict individual choice in technology, but to establish a consistent, evidence-based approach in assessing new technologies. This approach will ensure that what we buy genuinely works and has evidence of meaningful benefit.