Exploring in-person self-led debriefings for groups of learners in simulation-based education: an integrative review

Background Facilitator-led debriefings are well-established for debriefing groups of learners in immersive simulation-based education. However, there has been emerging interest in self-led debriefings, whereby individuals or groups of learners conduct a debriefing themselves, without the presence of a facilitator. How and why self-led debriefings influence debriefing outcomes remains undetermined. Research aim The aim of this study was to explore how and why in-person self-led debriefings influence debriefing outcomes for groups of learners in immersive simulation-based education. Methods An integrative review was conducted, searching seven electronic databases (PubMed, Cochrane, Embase, ERIC, SCOPUS, CINAHL Plus, PsychINFO) for peer-reviewed empirical studies investigating in-person self-led debriefings for groups of learners. Data were extracted, synthesised, and underwent reflexive thematic analysis. Results Eighteen empirical studies identified through the search strategy were included in this review. There was significant heterogeneity with respect to study designs, aims, contexts, debriefing formats, learner characteristics, and data collection instruments. The synthesised findings of this review suggest that, across a range of debriefing outcome measures, in-person self-led debriefings for groups of learners following immersive simulation-based education are preferable to conducting no debriefing at all. In certain cultural and professional contexts, such as with postgraduate learners and those with previous debriefing experience, self-led debriefings can support effective learning and may provide educational outcomes equivalent to those of facilitator-led debriefings or combined self-led and facilitator-led strategies. Furthermore, there is some evidence to suggest that combined self-led and facilitator-led approaches may optimise participant learning, and this approach warrants further research.
Reflexive thematic analysis of the data revealed four themes: promoting self-reflective practice; experience and background of learners; challenges of conducting self-led debriefings; and facilitation and leadership. As with facilitator-led debriefings, promoting self-reflective practice within groups of learners is fundamental to how and why self-led debriefings influence debriefing outcomes. Conclusions In circumstances where simulation resources for facilitator-led debriefings are limited, self-led debriefings can provide an alternative opportunity to safeguard effective learning. However, their true value within the scope of immersive simulation-based education may lie as an adjunctive method alongside facilitator-led debriefings. Further research is needed to explore how best to enable the process of reflective practice within self-led debriefings, and to understand how, and in which contexts, self-led debriefings are best employed, thereby maximising their potential use. Supplementary Information The online version contains supplementary material available at 10.1186/s41077-023-00274-z.


Background
As a medium for deliberate reflective practice, debriefing is commonly cited as one of the most important aspects of learning in immersive simulation-based education (SBE) [1][2][3]. Defined as a 'discussion between two or more individuals in which aspects of performance are explored and analysed' ([4], p. 658), debriefing should occur in a psychologically safe environment in which learners can reflect on actions, assimilate new information with previously constructed knowledge, and develop strategies for future improvement within their real-world context [5,6]. Debriefings are typically led by facilitators who guide conversations to ensure content relevance and achievement of intended learning outcomes (ILOs) [7]. The quality of debriefing is thought to be highly reliant on the skills and expertise of the facilitator [8][9][10][11], with some commentators arguing that the skill of the facilitator is the strongest independent predictor of successful learning [2]. Literature from non-healthcare industries tends to support this notion, suggesting that facilitators enhance reflexivity, concentration, and goal setting, whilst contributing to the creation and maintenance of psychological safety, leading to improved debriefing effectiveness [12,13]. However, this interpretation is not universally held and has been increasingly challenged [14][15][16][17][18].
It is in this context that there has been an emergence of self-led debriefings (SLDs) in SBE. There is currently no consensus definition of SLDs within the literature, with the term encompassing a variety of heterogeneous practices, thus creating a confusing narrative for commentators to navigate as they report on debriefing practices. We have therefore defined 'self-led debriefing' as a debriefing conducted by the learners themselves without the immediate presence of a trained faculty member. Several reviews have investigated the overall effectiveness of debriefings, with a select few drawing comparisons between SLDs and facilitator-led debriefings (FLDs) as part of their analysis [2-4, 7, 10, 17, 19-22]. The consensus from these reviews is that there is limited evidence of superiority of one approach over the other. However, only four of these reviews conducted a critical analysis of the presence of facilitators within debriefings [2,19,20,22]. Moreover, in one review [19], a narrow search strategy identified only one study comparing SLDs with FLDs [14]. To our knowledge, only one published review has explored SLDs specifically, investigating whether the presence of a facilitator in individual learner debriefings, in-person or virtual, impacted on their effectiveness [23]. Within these parameters, the review concluded that well-designed SLDs and FLDs produce equivalent outcomes; however, it did not explore the influence of in-person SLDs on debriefing outcomes for groups of learners in immersive SBE. The value and place of SLDs within this context, either in isolation or in comparison with FLDs, therefore warrants further investigation.
Within the context of immersive SBE, and in-person group debriefings specifically, we challenge the concept of 'one objective reality', instead advocating for the existence of multiple subjective realities constructed by individuals or groups. The experiences of learners influence both their individual and group perceptions of reality, and therefore different meanings may emerge from the same nominal simulated learning event (SLE) [24]. As such, this study has been undertaken through the lens of both constructionism and constructivism, with key elements deriving from both paradigms. Constructionism espouses the profound impact that societal and cultural norms have on determining how subjective experiences influence an individual's formation of meaning within the world, or context, that they inhabit [25,26]. Constructivism is a paradigm whereby, from their subjective experiences, individuals socially construct concepts and schemas to cultivate personal meanings and develop a deeper understanding of the world [26,27]. In the context of in-person group debriefings, the creation of such meaning, and therefore learning, may be shaped and influenced by the presence or absence of facilitators [24].
The discourse surrounding requirements for facilitators and their level of expertise in debriefings has important implications due to the resources required to support faculty development programmes [2,8,9,28]. SLDs are a relatively new concept offering a potential alternative to well-established FLD practices. Evidence exploring the role of in-person SLDs for groups of learners in immersive SBE is emerging but is yet to be appropriately synthesised. The aim of this integrative review (IR) is to collate, synthesise and analyse the relevant literature to address a gap in the evidence base, thereby informing simulation-based educators of best practices. The research question is: with comparison to facilitator-led debriefings, how and why do in-person self-led debriefings influence debriefing outcomes for groups of learners in immersive simulation-based education?

Methods
The traditional perception of systematic reviews as the gold-standard review type has been increasingly challenged, especially within health professions educational research [29]. An IR systematically examines and integrates findings from studies with diverse methodologies, including quantitative, qualitative, and theoretical datasets, allowing for deep and comprehensive interrogation of complex phenomena [30]. This approach is particularly relevant in SBE, where researchers employ a plethora of study designs from differing theoretical perspectives and paradigms. An IR is therefore best suited to answer our research question and help satisfy the need for new insights such that our understanding of SBE is not restricted [31].
This IR has been conducted according to Whittemore & Knafl's framework [30] and involved the following five steps: (1) problem identification; (2) literature search; (3) data evaluation; (4) data analysis; and (5) presentation of findings. Whilst the key elements of this study's methods are presented here, a detailed account of the review protocol has also been published [24]. The protocol highlights the rationale and justification of the chosen methodology, explores the underpinning philosophical paradigms, and critiques elements of the framework used [24].

Problem identification
We modified the PICOS (population, intervention/interest, comparison, outcome, study design) [32] framework to help formulate the research question for this study (Table 1), supplementing the 'comparison' arm with 'context' as described by Dhollande et al. [33]. This framework suited our study, in which the research question is situated within the context of well-established FLD practices within SBE. Simultaneously, it recognises that studies with alternative or no comparative arms can also contribute valuable insights into how and why SLDs influence debriefing outcomes.

Search strategy
Using an extensive and broad strategy to optimise both the sensitivity and precision of the search, we searched seven electronic bibliographic databases (PubMed, Cochrane, Embase, ERIC, SCOPUS, CINAHL Plus, PsychINFO), up until and including October 2022. The search terms are presented below in a logic grid (Table 2). Using a comparator/context arm minimised the risk of missing studies describing SLDs as what they are not, i.e. 'without a facilitator'. A full delineation of each search strategy, including keywords and Boolean operators, for each electronic database is available (Additional file 1). Additionally, we conducted a manual search of reference lists from relevant studies and SBE internet resources. We enlisted the expertise of a librarian to ensure appropriate focus and rigour [34,35].

Inclusion and exclusion criteria
Articles were included in this review if their content met the following criteria: (1) in-person debriefings following immersive simulated learning events; (2) debriefings …

Table 1 PICOS framework [32,33] used to construct research question

Population: In-person immersive SBE debriefing participants
Intervention / Interest: Self-led debriefings
Comparison / Context: With or without facilitator-led debriefings
Study design: Integrative: both quantitative and qualitative studies included

Forms of grey literature, such as doctoral theses, conference or poster abstracts, opinion or commentary pieces, letters, websites, blogs, instruction manuals and policy documents, were excluded. Similarly, studies that described clinical event, individual, non-immersive or virtual debriefings were also excluded. Date of publication was not an exclusion criterion.

Study selection
Following removal of duplicates using the bibliographical software package EndNote™ 20, we screened the titles and abstracts of retrieved studies for eligibility. Full texts of eligible studies were examined. Application of the inclusion and exclusion criteria determined which of these studies were appropriate for inclusion in this IR. We used a modified version of the Preferred Reporting Items for Systematic Review and Meta-Analysis Protocols (PRISMA) reporting tool [36] to document this process.

Data evaluation
The process of assessing quality and risk of bias is complex in IRs due to the diversity of study designs, with each type of design generally necessitating differing criteria to demonstrate quality. In the context of this complexity, we used the Mixed Methods Appraisal Tool (MMAT), which details distinct criteria tailored across five study designs: qualitative, quantitative randomised-controlled trials (RCTs), quantitative non-RCTs, quantitative descriptive, and mixed methods [37].

Data analysis
Data were analysed using a four-phase constant comparison method originally described for qualitative data analysis [38,39]. Data are compared item by item so that similar data can be categorised and grouped together, before further comparison between different groups allows for an analytical synthesis of the varied data originating from diverse methodologies. These phases comprise (1) data reduction; (2) data display; (3) data comparison; and (4) conclusion drawing and verification [30,38,39]. Following data reduction and extraction, we performed reflexive thematic analysis (RTA) according to Braun & Clarke's framework [40] to identify patterns, themes and relationships that could help answer our research question and form new perspectives and understandings of this complex topic [41]. RTA is an approach underpinned by qualitative paradigms, in which researchers have a central and active role in the interpretative analysis of patterns of data and their meanings, and thus subsequent knowledge formation [40]. RTA is particularly suited to IRs exploring how and why complex phenomena might exist and relate to one another, as it enables researchers to analyse diverse datasets reflexively. It can therefore facilitate the construction of unique insights and perspectives that may otherwise be missed through other forms of data analysis. A comprehensive justification, explanation and critique of this process can be found in the accompanying IR protocol [24].

Study selection and quality assessment
The search revealed a total of 1301 publications, of which 357 were duplicates. After screening titles and abstracts, 69 studies were identified for full-text screening. From this, a total of 18 studies were included for data extraction and synthesis (Fig. 1). Reasons for study exclusion are listed in Additional file 2. All 18 studies were appraised using the MMAT (Table 3). Five questions, adjusted for differing study designs, were asked of each study and assessed as 'yes', 'no' or 'can't tell'. The methodological qualities and risk of bias within individual studies impacted the analysis of their data and their subsequent weighting and contribution to the results of this review. The quality assessment process therefore influences the interpretations that can be drawn from such a collective dataset. Whilst the studies demonstrated varying quality, scoring between 40 and 100% 'yes' answers across the five questions, no studies were excluded from the review on the basis of the quality assessment. There were wide discrepancies in the quality of different components of the mixed methods studies. For example, Boet et al. [15] scored 0% for the qualitative component and 100% for the quantitative component of their mixed methods study. The quantitative results were therefore weighted more heavily than the qualitative component in the data analysis and its incorporation into the results of this review. Meanwhile, Boet et al.'s [16] qualitative study scored 100%, thus strengthening the influence and contribution of its results within this IR.

Learner characteristics
In total, the 18 studies recruited 2459 learners. Of these, the majority were undergraduate students of varying professional backgrounds: 1814 nursing, 210 medical, 158 …

Table 4 Authors, year, and location; study aims, designs and learner samples; debriefing formats; data collection instruments and outcome measures; and key reported study findings of the included studies

In four studies [15,16,49,54], learners worked with their own professional group rather than as part of an interprofessional team.

Self-led debriefing format
The specific debriefing activities, whether SLDs, FLDs or a combination of both, took several different formats and lasted between 3 and 90 min. Most SLDs utilised a written framework or checklist to guide learners through the debriefing, although this was unclear in two studies [42,44]. Two studies required learners to independently self-reflect, via a written task, prior to commencing group discussion [49,50]. Some studies included video playback within their debriefings [15,16,42,43,[49][50][51][52].

Data collection instruments and outcome measures
In total, 38 different data collection instruments were used across the 18 studies. These are listed, along with their components and incorporated scales where described in sufficient detail within the primary study (Table 4). The validity and reliability of these instruments are variable. Indeed, 13 data collection instruments were developed by study authors without data on validity or reliability. Authors used one or more instruments to measure outcomes in five key domains (Table 5).

Key reported findings of studies
There was significant heterogeneity between the designs, aims, samples, SLD formats, outcome measures and contexts of the 18 studies, with often conflicting and inherently biased findings due to the study designs and outcome measures used. Nine studies reported equivalent outcomes regarding some elements of either debriefing quality, participant performance or competence, self-confidence or self-assessment of competence, and participant satisfaction [15, 45-49, 52, 53, 56]. However, of these nine, five also reported that SLDs were significantly less effective on other elements of the outcome measures [45,46,49,52,56]. In addition to these five, two studies reported decreased effectiveness of SLDs in comparison to FLDs or a combination of SLD + FLD [43,50]. Conversely, only Lee et al. [52] and Oikawa et al. [48] reported any significant improvements with selected outcome measures with SLDs compared with FLDs, whilst Kündig et al. [47] reported improvements in two performance parameters with SLDs when compared with no debriefing. Four studies investigated a combination strategy of SLD + FLD and demonstrated either significantly improved or equivalent outcomes compared with either SLDs or FLDs alone [49][50][51]56]. Kang and Yu [51] reported significantly improved outcomes for problem-solving and debriefing satisfaction, but no differences in debriefing quality or team effectiveness. Other studies reported the opposite, with significantly improved team effectiveness and debriefing quality, but no improvements in problem-solving or debriefing experience [49,50]. Tutticci et al. [56] reported both significant and non-significant improvements in reflective thinking, dependent on which scoring tool was used. These findings, however, are in the context of variable quality appraisal scores (Table 3), wide variation in SLD formats and data collection instruments, and improved outcomes regardless of the method of debriefing used.

Thematic analysis results
We undertook reflexive thematic analysis (RTA) of the data set, revealing four themes and 11 subthemes (Fig. 2). The process of tabulating themes and an exemplar of coding strategy and theme development can be found in Additional files 3 and 4.

Theme 1: Promoting self-reflective practice
The analysis of data revealed that promoting self-reflective practice is the most fundamental component of how and why SLDs influence debriefing outcomes. Debriefings can encourage groups of learners to critically reflect on their shared simulated experiences, leading to enhanced cognitive, social, behavioural and technical learning [15, 16, 42, 43, 45-48, 50, 51, 53, 54, 56, 57]. Different components within SLDs, including structured frameworks, video playback, and debriefing content, may influence such self-reflective practice. Most authors advocated a printed framework or checklist to help guide learners through the SLD process. Despite this, however, SLDs were found to be less structured than FLDs [16]. The Gather-Analyse-Summarise framework [59] was most commonly used [46,[49][50][51]53]. One study compared two locally developed debriefing instruments, the Team Assessment Scales (TAS) and Quick-TAS (Q-TAS), concluding that the Q-TAS was more effective in enabling the analysis of actions, but equivalent in all other measures [54].
Video playback offered a form of feedback for learners that encouraged reflective processing of scenarios [15,16,52]. One article concluded by quoting a learner: 'I learned it's worthwhile to revisit situations like this. I know I won't always have video to critique, but being able to rethink through the appointment will be helpful to review which tactics helped and which ones need to be revised' ([42], p. 929). In this manner, video playback enables learners to perceive behaviours of which they were previously unaware [15]. Whilst many studies lacked interrogation of the content within SLDs, Boet et al. [16] provided an extensive analysis, reporting that interprofessional SLDs centred on content such as situational awareness, leadership, communication, roles, and responsibilities. Furthermore, it was learners' perceived performance of this content that offered entry points into reflection [16]. Some studies required learners to document their thoughts and impressions [44,45,47,48,50,53]. However, the influence of content documentation on promoting self-reflective practice was inconclusive.
Combined SLD + FLD strategies involved learner and faculty co-debriefing [56], or SLDs preceding FLDs [49][50][51]. Using the Reflective Thinking Instrument, one study reported that FLD and combined SLD + FLD groups demonstrated significantly higher levels of reflective thinking amongst learners compared with SLD groups [56]. Within the limitations of a tool with poor validity and reliability, this study provides the best evidence that a combination approach to debriefing groups may be the most beneficial method for encouraging learner critical self-reflection. This finding is supported by results from three other studies showing improved outcomes with combined debriefing strategies, across team effectiveness [49], debriefing quality [50], problem-solving processes [51] and satisfaction with debriefing [50,51].

Theme 2: Experience and background of learners
The experience and background of learners has a profound impact on how and why SLDs influence debriefing outcomes. Previous SBE experience may significantly impact the ability of learners to meaningfully engage with the SLD process and influences their expectations as to how a simulated scenario will progress [15,16]. Furthermore, previous experience with FLDs may positively contribute to rich reflective discussion within SLDs, as learners are better placed to integrate FLD goals and processes within a new context [16]. Whilst its influence on the conduct of SLDs is less clear, Boet et al. [16] note that real-world clinical experience allows learners to recontextualise their simulated experiences more readily and may therefore act as an entry point into the reflective process. In teams from the same professional background, learners appreciated the value of learning from constructive exchanges of opinion between colleagues operating at the same level [42,44,45], and of role-modelling teamwork behaviours [48], whilst interprofessional SLDs may help break down traditional working silos and support learning in contexts that replicate clinical practice [15]. Finally, learners originated mainly from either South Korea or North America. Cultural differences between Korean and Western learners may affect debriefing practices, with Korean students being described as less expressive than their Western colleagues [46]. The impact of cultural diversity on SLD methods, however, was not specifically investigated [44,46,53].

Table 5 (domains III–V): III. Learner self-confidence or self-assessed competence covering a range of skills and behaviours; IV. Learner satisfaction or experience with the simulation or debriefing modality; V. Debriefing content via qualitative data analysis using a constant comparison method

Fig. 2 Thematic analysis map illustrating themes and subthemes

Theme 3: Challenges of conducting SLDs
Challenges of conducting SLDs were constructed from the dataset, including closing knowledge gaps, reinforcement of erroneous information, and resource allocation. The absence of expert facilitators may present a missed learning opportunity whereby erroneous information could be discussed and consolidated, negatively affecting subsequent performance [44,45,47,51] and potentially persisting into clinical practice [46]. There was a consistent student preference for FLDs over SLDs, which may indicate learners seeking expert reassurance and accurate debriefing content not readily available from peers [43,50]. A significant motivating factor for investigating and employing SLDs is the potential to reduce costs by removing the requirement for expensive faculty presence [15, 16, 44-46, 49, 57]. However, SLDs do not appear to negate the need for faculty presence completely, but rather limit the faculty role to specific elements within a SLE [15,16]. Furthermore, the most influential impact on debriefing outcomes may come from incorporating SLDs in combination with, rather than at the expense of, FLDs [49-51, 56]. Finally, most articles integrated a FLD element within their SLE, thereby negating positive impacts on resource allocation [15, 16, 42-46, 49-51, 54-57].

Theme 4: Facilitation and leadership
The facilitation and leadership of SLDs may have a considerable impact on how and why SLDs influence debriefing outcomes. Only five articles described how learners were allocated as leaders and facilitators of SLDs [43, 54-57]. Random allocation of learners to lead and facilitate SLDs occurred either prior to, or on the day of, the SLE [54-56]. In two studies, learners took turns leading the debrief such that all learners facilitated at least one SLD [43,57]. No articles discussed the influence of leadership and facilitation on learners, nor the learners' reactions, thoughts, or feelings towards the role, or its effect on the content and reflective learning of subsequent debriefings. In two articles describing the same learner sample, only one of 17 interprofessional SLDs was nurse-led, all others being led by a medical professional [15,16]. Such situations may have unintended implications by reinforcing stereotypes and hierarchical power imbalances.
Learners were trained to lead the SLDs in only two studies. In one, learners were randomly allocated to lead the SLDs and were directed to online resources, including videos, checklists, and relevant articles, to help them prepare for this role prior to the SLE [56]. No information concerning learners' engagement with these resources was documented. In the other study, learners were given 60 min of training on providing constructive feedback to peers, which did not lead to improved outcomes for debriefing quality, performance, or self-confidence [43].

Discussion
The aim of this IR was to collate, synthesise, and analyse the relevant literature to explore, with comparison to FLDs, how and why in-person SLDs influence debriefing outcomes for groups of learners in immersive SBE. The review identified 18 empirical studies with significant heterogeneity in respect of designs, contexts, learner characteristics, and data collection instruments. It is important to recognise that the review's findings are limited by the variety and variable quality of the data collection instruments and debriefing outcome measures used in these studies, as well as by some of the study designs themselves. Nevertheless, the findings of this review suggest that, across a range of debriefing outcomes, in situations where resources for FLDs are limited, SLDs can provide an alternative opportunity to safeguard effective learning. In some cultural and professional contexts, and for certain debriefing outcome measures, SLDs and FLDs may provide equivalent educational outcomes. Additionally, a small cohort of studies suggests that combined SLD + FLD strategies may be the optimal approach. Furthermore, SLDs influence debriefing outcomes most powerfully by promoting self-reflection amongst groups of learners.

Promoting self-reflection
Aligned with social constructivist theory [80], the social interaction of collaborative group learning in a reflective manner can lead to the construction, promotion, and sharing of a wide range of interpersonal and team-based skills [81,82]. Currently, there is a lack of evidence concerning which frameworks are best suited to maximise such reflection [10], especially in SLDs. Whilst framework use is associated with improvements in debriefing quality and subsequent performance, some evidence suggests that, in practice, the specific framework itself is of less importance than the skills of the facilitator using it and the context in which it is applied [7,9,10]. In SLDs there is no facilitator to guide this process, and as such, one may infer that the framework itself has relatively more influence on debriefing outcomes and the reflective process of learners than it does in FLDs. Conversely, whichever framework was used, the quality of the SLDs was rated highly, implying that it may be the structure provided by the framework, as opposed to the framework content, that is the critical factor for promoting reflection. Based on the findings of their qualitative study, in which self-reflexivity, connectedness, and social context informed learning within debriefings, Gum et al. [83] developed a reflective conceptual framework rooted in transformative learning theory [84], which purported to enable learners to engage in critical discourse and learning. By placing learners at the centre of their model, and by focusing on the three themes previously mentioned, this framework seems suited to groups of learners in SLDs. However, like many other debriefing frameworks, it remains untested in SLD contexts.
In a study of business students, Eddy et al. [85] describe using an online tool that captured and analysed individual team members' perceptions of an experience anonymously. The tool then prioritised reported themes to create a customised guide for teams to use in a subsequent in-person group SLD. The study reported that using this tool resulted in superior team processes and subsequently greater team performance when compared with SLDs using a generic debriefing guide only. Such tools may have a place in promoting self-reflection in healthcare SBE, such as with postgraduate learners with previous experience of debriefings or those who have undertaken training in debriefing facilitation.
Furthermore, other structures or techniques that may help influence and promote self-reflection amongst groups of learners in SLDs are, as yet, untested in this context. For example, SLDs could take the form of in-person or online post-scenario reflective activities in which learners work collaboratively on pre-determined tasks that align with ILOs. Examples such as escape room activities in SBE, in which learners work together to solve puzzles and complete tasks through gamified scenarios, have used concepts grounded in self-determination theory [86], with promising results in terms of improving self-reflection and learning outcomes [87,88]. Meanwhile, individual virtual SLD interventions, rooted in Kolb's experiential learning theory [89], have been tested and purport to enable critical reflection amongst users [90,91]. Whilst such approaches may be relatively resource-intensive to create, they could be applied to SLDs for groups of learners in immersive SBE and prove resource-efficient once established.

Video playback
In both individual and group SLD exercises, video playback can allow learners to self-reflect, analyse performance, minimise hindsight bias, and identify mannerisms or interpersonal behaviours that may otherwise remain hidden [15, 42, 52, 92-95]. These findings are supported by situated learning theory, whereby learning can be associated with repeated cycles of visualisation of, and engagement with, social interactions and interpersonal relationships that enable co-construction of knowledge amongst learners [96]. Conversely, in group SLD contexts, watching video playback may have unintended consequences for psychological safety, making learners feel self-conscious and anxious and impacting negatively on their ability to meaningfully engage with reflective learning [93]. A systematic review concluded that the benefits of video playback are highly dependent on the skill of the facilitator rather than on the video playback itself [95], and as such its role in influencing debriefing outcomes in SLDs remains uncertain.

Combining self-led and facilitator-led debriefings
The findings of this review suggest that employing combinations of SLDs and FLDs may optimise participant learning [49-51, 56], whilst acknowledging that this may also depend on other variables such as the expertise of debriefers and the contexts within which debriefings occur. Whilst the reported improved outcomes are situated in the context of in-person SLDs for groups of learners, they are supported by the wider literature. For example, a Canadian research group investigated combining in-person and virtual individual SLD formats with FLDs, reporting improved debriefing outcomes across multiple domains including knowledge gains, self-efficacy, maximising reflection, and debriefing experience [90, 97-99]. SLD components of the combined strategy enable learners to reflect, build confidence, identify knowledge gaps, collect and organise their thoughts, and prepare for group interaction prior to a FLD [90, 97-99]. However, limitations of these studies include the unreliability of their outcome measures.

Facilitation and leadership
Only two studies provided training for learners in how to facilitate debriefings and provide constructive feedback [43,56]. This is surprising given the emphasis on faculty development in the SBE literature [6,9,28,100]. RTA of the data highlighted how previous experience with FLDs may influence learners' ability to actively engage with the reflective nature of the SLD process [15,16]. This brings into question whether learners should have some familiarity with debriefing processes, via either previous experience or targeted training, prior to facilitating group SLDs.
Variables such as learners' debriefing experiences and educational context also have implications for the interpretation of the findings of this review, raising questions about whether SLDs may be more suitable for certain populations, such as students in early training or postgraduates who are relatively more experienced in SBE. Training peers as facilitators, who then act in an 'instructor' role rather than as part of the learner group, has also been reported as an effective method of positively influencing debriefing outcomes [101,102]. However, training learners to facilitate SLDs involves significant resource commitments, thus negating some of the initial reasons for instigating SLDs.

Data collection instruments and outcome measures
The studies included in this review used multiple data collection tools to gauge the influence of SLDs on debriefing outcomes across five domains (Table 5). This diversity in approaches to outcome measurement is problematic, as it impedes the ability to compare studies fairly, effectively, and robustly [103]. Certain instruments, such as the Debriefing Assessment for Simulation in Healthcare-Student Version [68] and the Debriefing Experience Scale [69], are validated and reliable tools for assessing learner perceptions of, and feelings towards, debriefing quality in certain contexts. However, learner perceptions of debriefing quality do not necessarily translate into an objective evaluation of debriefing practices. Additionally, some studies relied on learner self-confidence and self-reported assessment questionnaires for their outcome measures, despite self-perceived competence and confidence being poor surrogate markers for clinical competence [104]. Commonly used tools measuring debriefing quality may not be suitable for SLDs, and a 'one-size-fits-all' approach could invalidate results [105]. To our knowledge, there is no validated and reliable tool currently available that specifically assesses the debriefing quality of SLDs.

Psychological safety
One important challenge of conducting SLDs, which was not constructed through the RTA of this dataset, is ensuring the psychological safety of learners in debriefings. Psychological safety is defined as 'a shared belief held by members of a team that the team is safe for interpersonal risk taking' ([106], p. 350), and its establishment, maintenance, and restoration in debriefings is of paramount importance for learners participating in SBE [107,108]. Oikawa et al. [48] stated that 'self-debriefing may augment reflection through the establishment of an inherently safe environment' ([48], p. 130), although how safe environments are 'inherent' within SLDs is unclear. Tutticci et al. [56] quote secondary sources [83,109] stating that peer groups can improve collegial relationships and engender safe learning environments that improve empathy whilst reducing the risk of judgement. Conversely, it may also transpire that psychologically unsafe environments are fostered, leading to unintended harmful practices. In interprofessional contexts, where historical power imbalances, hierarchies, and professional divisions can exist [11,110,111], and in which facilitator skill has been the most frequently cited enabler of psychological safety [112], one can infer that threats to psychological safety may be accentuated in SLDs.
In contrast, researchers found that the process of engaging in an individual SLD enhanced psychological safety by helping learners decrease their stress and anxiety, leading to more active engagement and meaningful dialogue in subsequent FLDs [99]. Another study reported learners describing how the familiarity of connecting with known peers within SLDs fostered psychological safety and enabled learning [98]. However, these studies were excluded from this review because they involved individual rather than group SLDs. Nevertheless, their findings that combined SLD + FLD strategies enable psychological safety may partially explain the findings of this review, and psychological safety may therefore be a central concept in understanding how and why SLDs influence debriefing outcomes.
For teams regularly working together in clinical contexts, their antecedent psychological safety has a major influence on any SLEs they undertake [113]. This subsequently impacts how team members, both individually and collectively, experience psychological safety within their real clinical environment [113]. The place of SLDs in such contexts, along with their potential advantages and risks, remains undetermined.

Limitations
This review specifically investigates in-person group debriefings, and therefore the results may not be applicable to individual or virtual SLD contexts. The inclusion criteria allowed for published peer-reviewed empirical research studies in English, excluding grey literature. This may introduce bias, with some evidence suggesting that excluding grey literature can lead to over-exaggerated conclusions [114,115] and concerns regarding publication bias [116]. We also acknowledge that the choices made in constructing and implementing our search strategy (Additional file 1) may have impacted the total number of articles identified for inclusion in this review. Finally, the heterogeneity of the included studies limits the certainty with which generalisable conclusions can be made. Conversely, this heterogeneity enables a diverse body of evidence to be analysed and better informs where gaps may lie and where future research is needed.

Recommendations for future research
The findings of this review have highlighted several areas requiring further research. Firstly, the role of combining group SLDs with FLDs should be explored, both quantitatively and qualitatively, to explain its place within immersive SBE. Secondly, to inform best practice, different methods, structures, and frameworks of group SLDs need investigating to assess what may work, for whom, and in which context. This extends to further research investigating different groups, such as interprofessional learners, to ascertain whether certain contexts are more suitable for SLDs than others. Such work may feed into the production of guidelines to help standardise SLD practices across these differing contexts. Thirdly, assessment and testing of data collection instruments is required, as current tools are not fit for purpose. Clarification of what is suitable and measurable in terms of debriefing quality and learning outcomes, especially in relation to group SLDs, is needed. Finally, whilst research into fostering psychological safety in FLDs is emerging, the same is not true in the context of SLDs, and this needs to be explored to ensure that SLDs are not psychologically harmful for learners.

Conclusions
To our knowledge, this is the first review to explore how and why in-person SLDs influence debriefing outcomes for groups of learners in immersive SBE. The findings address an important gap in the literature and have significant implications for simulation-based educators involved with group debriefings across a variety of contexts. The synthesised findings of this review suggest that, across a range of debriefing outcome measures, in-person SLDs for groups of learners following immersive SBE are preferable to conducting no debriefing at all. In certain cultural and professional contexts, such as with postgraduate learners and those with previous debriefing experience, SLDs can support effective learning and may provide educational outcomes equivalent to those of FLDs or SLD + FLD combination strategies. Furthermore, there is some evidence to suggest that SLD + FLD combination approaches may optimise participant learning, and this approach warrants further research.
Under certain conditions and circumstances, SLDs can enable learners to achieve suitable levels of critical self-reflection and learning. As with FLDs, promoting self-reflective practice within groups of learners is fundamental to how and why SLDs influence debriefing outcomes, because it is through this metacognitive skill that effective learning and behavioural consolidation or change can occur. However, more work is required to ascertain for whom and in which contexts SLDs may be most appropriate. In situations where resources for FLDs are limited, SLDs may provide an alternative opportunity to enable effective learning. However, their true value within the scope of immersive SBE may lie in their use as an adjunctive method alongside FLDs.

Table 4
Overview and characteristics of included studies

Authors, year, and location Stated study aim and research design Participant and sample characteristics Description of SLE and SLD activity Data collection instruments and outcome measures Key reported study findings Qualitative studies


Table 5
Outcome measures
I. Debriefing quality (assessed either by learners themselves or by observers)
II. Individual or group performance or competence (assessed by observers rating skills, knowledge, or behaviours)
III. Learner self-confidence or self-assessed competence covering a range of skills and behaviours
IV. Learner satisfaction or experience with the simulation or debriefing modality
V. Debriefing content via qualitative data analysis using a constant comparison method