Mindset XR Module 18: Analysis and interpretation

Mental Health

Welcome to the Mindset Extended Reality (XR) Innovation Support Programme learning resources, which include three series delivered in conjunction with our expert Mindset-XR programme partners:

• Medical regulation

• Clinical evidence

• Lived experience involvement

Mindset-XR is helping to catalyse the growth of immersive digital mental health solutions in the UK, through funding, tailored support and training. It is delivered by Innovate UK and the Health Innovation Network South London (HIN).

This series focuses on research and clinical evidence, with key insights from King's College London's Institute of Psychiatry, Psychology and Neuroscience. Across a number of modules, these resources will guide you through your research journey, from establishing what you plan to investigate, to conducting research and disseminating your findings.

Outline

Welcome to Module 18: Analysis and interpretation. This module introduces the considerations involved in analysing and interpreting XR research.

In this section, we're focusing on:

Types of data

Inferential statistics

Qualitative analysis

User experience and usability

Psychological and biometric analysis

Results interpretation

Common limitations

Types of data

Data is used to quantify an aspect of your research. There are different types of data we can collect in research, and each will have implications for the types of analysis and conclusions that we can draw.

Rating scales and questionnaires can be used to collect numerical information. Data collected in this way can be continuous, meaning it can offer an absolute quantification of an aspect.

Example:

Think about scales that measure, for instance, depressive symptom severity. They give a numerical value within a range, from zero to fifty for example. Measurements of this type allow us to compare individual scores and estimate differences in depression severity between individuals.

Some measures also have what are called normative values, or standard scores. These allow us to compare a score with general population norms. This is often the case for measures of cognition or thinking skills.

Using continuous and validated measures is often preferred in research trials, as this allows quantification of change (for instance, before and after an intervention) and comparison between the intervention and control groups.
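As a minimal sketch of what quantifying change can look like, the Python snippet below computes each participant's change score (post minus pre) on a continuous measure and compares the mean change between an intervention group and a control group. The scores, the group sizes, and the helper name `mean_change` are all made up for illustration; they are not data from any study.

```python
from statistics import mean

# Hypothetical (pre, post) scores on a 0-50 symptom scale, illustrative only.
intervention = [(32, 20), (28, 19), (35, 27), (30, 18)]
control = [(31, 29), (27, 26), (34, 30), (29, 28)]


def mean_change(pairs):
    """Mean of (post - pre); negative values mean symptoms decreased."""
    return mean(post - pre for pre, post in pairs)


intervention_change = mean_change(intervention)
control_change = mean_change(control)

# Between-group difference in mean change.
difference = intervention_change - control_change

print("Intervention mean change:", intervention_change)
print("Control mean change:", control_change)
print("Difference between groups:", difference)
```

A difference in mean change is only a description of the sample. Whether it is likely to hold in the wider population is a question for inferential statistics.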

Another type of data that is often used in research is categorical data. This is information that helps us put individuals into different categories based on whether they meet certain criteria. In research, this information is used to describe participants or to form groups to compare.

Examples include having or not having a diagnosis of PTSD, or personal characteristics such as gender or ethnicity.

Another example:

Offering the same intervention for anxiety to males and females and comparing them on their anxiety reduction after therapy.

A third type of data is qualitative information, which is made up of people's opinions, discussions, and verbal feedback from interviews.

This is often captured by recording people's views and performing specific analyses on the recordings. The information gathered using these methods tends to be richer, but is difficult to quantify and generalise. Its interpretation is also more likely to suffer from some form of bias.

Descriptive statistics are used to summarise and describe the features of a dataset. Measures of central tendency, such as the mean, the median, and the mode, help to understand the average behaviour of users. Measures of dispersion, such as the range, variance, and standard deviation, are often used to understand the variability in user experience. Distributions can be shown with frequency distributions, histograms, and bar charts to visualise patterns in the data.
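The measures named above can be computed with Python's standard `statistics` module. This is a small sketch on made-up usability ratings (a hypothetical 1 to 5 scale), not a recommended analysis pipeline. It uses the sample variance and sample standard deviation, and prints a simple text bar chart as a stand-in for a frequency distribution.

```python
from collections import Counter
from statistics import mean, median, mode, stdev, variance

# Hypothetical usability ratings on a 1-5 scale, illustrative only.
ratings = [3, 4, 4, 5, 2, 4, 5, 3]

summary = {
    # Central tendency
    "mean": mean(ratings),
    "median": median(ratings),
    "mode": mode(ratings),
    # Dispersion
    "range": max(ratings) - min(ratings),
    "variance": variance(ratings),          # sample variance (n - 1)
    "standard_deviation": stdev(ratings),   # sample standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")

# Frequency distribution shown as a text bar chart.
frequencies = Counter(ratings)
for rating in sorted(frequencies):
    print(rating, "#" * frequencies[rating])
```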

Visualising and exploring the data through exploratory data analysis can be useful to uncover underlying patterns, anomalies, and relationships. Visual techniques include scatter plots, box plots, heat maps, and clustering of data. These are helpful for visualising trends.
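A box plot is built from a handful of numbers: the quartiles and the points that fall far outside them. The sketch below computes those numbers for made-up session durations and flags possible anomalies using the common 1.5 x IQR convention. The data and the variable names are assumptions for illustration, and the 1.5 multiplier is a convention rather than a rule.

```python
from statistics import quantiles

# Hypothetical VR session durations in minutes, illustrative only.
durations = [12, 14, 15, 15, 16, 17, 18, 19, 20, 45]

# Quartiles: the three cut points that split the data into four parts.
q1, q2, q3 = quantiles(durations, n=4)
iqr = q3 - q1

# Conventional box plot fences: points beyond them are drawn as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [d for d in durations if d < lower_fence or d > upper_fence]

print("Quartiles:", q1, q2, q3)
print("Fences:", lower_fence, upper_fence)
print("Possible anomalies:", outliers)
```

A flagged value is a prompt to look more closely (for example, a session that was left running), not an automatic reason to exclude it.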

Inferential statistics

Inferential statistics involve making predictions about a population based on a sample of data drawn from that population.

As it is not possible to recruit every patient with a particular condition, inferential statistics make inferences about the whole population based on a subset of individuals selected from that population.

It is useful to know that the population and the sample may differ in important ways. The population is often referred to as the reference group. This is the entire group of individuals, or instances, about whom we want to learn and to whom we want to generalise the results. The sample is a subset of that population, which we try to make as representative as possible of the whole population.

Inferential statistics aims to test hypotheses, so that at the end of the evaluation, when data is analysed, the research team should be able to reliably say if a hypothesis is supported.

The terms null hypothesis and alternative hypothesis are often used when considering inferential statistics methods.

Inferential statistics also use margins of uncertainty to qualify statements. A common term used in inferential statistics is the p-value, which is the probability of observing the data, or something more extreme, if the null hypothesis is true. A small p-value, typically below 0.05, is conventionally taken as evidence against the null hypothesis.
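One way to make the idea of a p-value concrete is an exact permutation test. If the null hypothesis is true and group membership makes no difference, every way of splitting the pooled scores into two groups is equally plausible, and the p-value is the share of splits that produce a difference at least as large as the one observed. The sketch below is only an illustration on made-up change scores. The function name `permutation_p_value` is hypothetical, and real projects usually rely on an established statistics package and a pre-specified test.

```python
from itertools import combinations

# Hypothetical change scores (post - pre), illustrative only.
intervention = [-12, -9, -8, -12]
control = [-2, -1, -4, -1]


def average(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = group_a + group_b
    observed = abs(average(group_a) - average(group_b))
    indices = range(len(pooled))
    extreme = 0
    total = 0
    # Enumerate every way of assigning len(group_a) scores to group A.
    for chosen in combinations(indices, len(group_a)):
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(average(a) - average(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


p_value = permutation_p_value(intervention, control)
print(f"p-value: {p_value:.4f}")
```

With four participants per group there are only 70 possible splits, so the smallest achievable two-sided p-value is 2/70. This is one reason very small samples limit what can be concluded.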

Another relevant concept in inferential statistics is the confidence interval. This is a range of values derived from the sample that is likely to contain the true population parameter or effect that we are looking for.
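As a rough sketch, a 95% confidence interval for a mean can be approximated as the sample mean plus or minus about 1.96 standard errors. The code below uses this normal approximation on made-up scores. The helper name `normal_ci` is hypothetical, and for small samples like this one a t-distribution based interval from a statistics package would usually be preferred and would be somewhat wider.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical post-intervention scores, illustrative only.
scores = [22, 25, 19, 30, 27, 24, 21, 28, 26, 23]


def normal_ci(values, confidence=0.95):
    """Normal-approximation confidence interval for the mean."""
    # z is about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    centre = mean(values)
    standard_error = stdev(values) / sqrt(len(values))
    margin = z * standard_error
    return centre - margin, centre + margin


lower, upper = normal_ci(scores)
print(f"Mean: {mean(scores):.2f}")
print(f"Approximate 95% CI: {lower:.2f} to {upper:.2f}")
```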

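Inferential statistics is not free of errors. There are two types of errors that can occur: a Type I error (a false positive, where a null hypothesis that is actually true is rejected) and a Type II error (a false negative, where a null hypothesis that is actually false is not rejected).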

There are a large number of statistical tests, but common ones used in projects that you might be undertaking are:

Tests comparing means like the t-test or ANOVAs.

Tests comparing categorical variables such as chi-square.

Tests looking at a relationship between continuous variables like correlations and regressions.
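To give a feel for what these tests compute, the sketch below calculates the core statistic for one test from each family using only Python's standard library: Welch's t statistic for comparing two means, a chi-square statistic for a 2x2 table of categories, and a Pearson correlation coefficient. All data and function names are made up for illustration. In practice a statistics package would normally be used, which would also supply the p-value for the t statistic and the correlation (omitted here because they need the t-distribution).

```python
from math import erfc, sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for the difference between two group means."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value (1 degree of freedom)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom the upper tail has this closed form.
    p = erfc(sqrt(chi2 / 2))
    return chi2, p


def pearson_r(x, y):
    """Pearson correlation coefficient between two continuous variables."""
    mx, my = mean(x), mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = sqrt(sum((yi - my) ** 2 for yi in y))
    return cov / (sx * sy)


# Hypothetical post-treatment scores for two groups.
t_stat = welch_t([20, 19, 27, 18], [29, 26, 30, 28])

# Hypothetical counts: rows are groups, columns are improved / not improved.
chi2, chi2_p = chi_square_2x2([[12, 8], [5, 15]])

# Hypothetical sessions attended versus presence rating.
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])

print(f"Welch t: {t_stat:.3f}")
print(f"Chi-square: {chi2:.3f}, p = {chi2_p:.3f}")
print(f"Pearson r: {r:.3f}")
```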

Qualitative analysis

Qualitative methods provide in-depth insights into user experiences and perceptions, often through non-numeric data. Common methods of qualitative analysis are:

User experience and usability

User experience and usability testing focuses on understanding user interaction and satisfaction with the product that is being developed.

The different methods:

Psychological and biometric analysis

Analysing physiological responses can also provide useful insight into the experience of participants.

Results interpretation

Always base your interpretation on the data.

Some considerations for interpreting data:

Even when research is conducted according to a solid protocol, it's not uncommon, due to issues encountered during the project, that you might not be able to answer your research question.

This could be to do with the treatment malfunctioning, missing data, or difficulties with recruitment.

It is helpful to be honest about these issues and to limit your interpretation accordingly.

You might be tempted to stretch your interpretations, but resist the urge.

If your study is a pilot and the main aim is to assess the acceptability of a VR intervention, refrain from making conclusive statements about the efficacy of that intervention.

How far is it possible to generalise your results?

Consider similarities and differences between the conditions of your research and other environments.

So if you test a VR procedure in your lab, would it have the same effect and uptake if used in schools? Consider these aspects when you try to generalise the results of your study.

How clinically meaningful is an improvement?

If the aim of your XR product is, for example, to reduce impulsive behaviours associated with ADHD, how meaningful is that reduction in impulsive behaviour to people with the condition?

Clinical usefulness requires input from experts by experience to be evaluated, yet researchers often try to make their own interpretation of it. For more information on experts by experience, refer to module 11.

How does the improvement from your intervention or procedure compare to the status quo?

Using standardised assessment tools and standardised scores allows you to compare your intervention to other interventions.

Standardised scores, such as t-scores, z-scores, and d-scores, express the effect of an intervention in standard terms, which allows us to compare it with other interventions or with the status quo.
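The sketch below shows how such standardised scores are commonly calculated. A z-score expresses a raw score as the number of standard deviations from a normative mean, and a T-score rescales the z-score to a mean of 50 and a standard deviation of 10. It is an assumption here that "d-score" refers to Cohen's d, the difference between two group means divided by their pooled standard deviation. All numbers and function names are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def z_score(raw, norm_mean, norm_sd):
    """Distance of a raw score from the normative mean, in SD units."""
    return (raw - norm_mean) / norm_sd


def t_score(z):
    """T-score: z rescaled to a mean of 50 and an SD of 10."""
    return 50 + 10 * z


def cohens_d(a, b):
    """Cohen's d using the pooled sample standard deviation."""
    n_a, n_b = len(a), len(b)
    pooled_variance = (
        (n_a - 1) * variance(a) + (n_b - 1) * variance(b)
    ) / (n_a + n_b - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_variance)


# Hypothetical cognitive test score against hypothetical population norms.
z = z_score(raw=85, norm_mean=100, norm_sd=15)
t = t_score(z)

# Hypothetical post-treatment symptom scores (lower is better).
d = cohens_d([20, 19, 27, 18], [29, 26, 30, 28])

print(f"z-score: {z:.2f}")
print(f"T-score: {t:.1f}")
print(f"Cohen's d: {d:.2f}")
```

Because these scores are unit-free, an effect measured with one questionnaire can be set alongside an effect reported for a different intervention that used another measure.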

Common limitations

Sample size
Many VR studies in mental health involve relatively small sample sizes due to the specialised nature of the technology. This limits the ability to generalise the findings to a broader population.
Diversity
Participants might not be representative of the wider population, particularly if the study sample lacks diversity.
Technology
There might be variations in the VR hardware and software used, such as headsets and VR environments. These can introduce inconsistencies. Results might differ based on the technology used, making it difficult to compare across different studies. These issues are compounded by how rapidly VR technology evolves: the hardware and software used in a study might soon be replaced by more advanced or different systems.
Immersion
The degree of immersion and presence experienced by participants can vary significantly, even within the same study, due to factors such as individual susceptibility to VR, the quality of the VR environment, and the specific mental health condition being considered. These can all influence how immersive the experience feels to different participants.
Cybersickness
Some participants might experience VR sickness, or cybersickness, which includes symptoms like nausea, dizziness, and headaches. These can affect their ability to engage with the VR environment and may skew results if some participants are unable to fully participate, or drop out. Instances of cybersickness should be recorded, and the procedure should be improved if these are common.
Control conditions
If a control condition is planned, it might be difficult to design one that controls for all the elements of VR. Establishing an appropriate control group can be hard because, compared with a non-VR condition, there might be too many differences.
Novelty
Novelty is a common feature of VR-based procedures. People tend to respond more positively to a VR intervention simply because it is novel and engaging, rather than because it has therapeutic value.
Long-term effect
Studies often focus on short-term effects due to the challenges of following up participants over a long period. This can limit assessment of the lasting impact of a VR intervention.

Got questions, comments or feedback?

Get in touch with the team

hin.mindset@nhs.net
