Debriefers have the challenging and rewarding task of guiding simulation participants in their post-experience reflection, both by affirming good behaviours and by helping to remedy shortfalls in performance [6, 22]. A debriefer’s ability to guide participants plays an important role in the delivery of simulation. The striking findings of this observational study were that debriefers tended to talk more than participants (Fig. 2), that participants gave significantly higher DASH scores than the debriefers gave themselves, and that participant DASH scores were higher for the debriefers who talked less. In addition, we observed a high level of debriefer satisfaction with using basic quantitative data (timing and diagrams) as an aid to providing feedback. We have structured the following discussion around the three objectives outlined in the background section.
Can real-time quantitative debriefing performance data be used for feedback?
This study assessed the use of timing data and conversational diagrams. Debriefers receiving feedback based on this data rated its ‘usefulness’ at a mean of 4.6 on a 5-point Likert scale. This is an encouraging finding. While it does not guarantee translation into better debriefing, data-driven feedback has been shown to significantly improve performance in other settings [2, 23]. This study was interrupted by the COVID-19 pandemic, leading to an under-recruitment of debriefings (n = 12); nevertheless, we were able to observe a broad range of interdisciplinary simulation participants and 7 debriefers across 2 SBME sites (Table 1), which suggests that the results may be extrapolated to other locations.
Regarding the use of timing data, we present the results for individual times and the ratios of debriefer versus simulation participant contributions (Fig. 2). While the timing data set is interesting within the boundaries of the study conditions, it is unclear whether such measurement would be practical in typical simulation environments given resource constraints. It is also unclear whether individual timing data is useful to the debriefers receiving feedback, or whether timings reflect quality. For example, knowing that an individual talked for a certain period does not necessarily reflect the usefulness of the content, nor the appropriateness of the contribution for promoting reflection. Within these limitations, we found that the data made it easy to start meaningful feedback conversations with debriefers about improving future performance [14]. For example, the timing data allowed discussion of how to involve quieter participants and how to increase the number of questions that encouraged reflection rather than ‘guess what I am thinking’. While the availability of timings and diagrams appeared to help with feedback, arguably much of this information could also have been obtained by direct observation alone.
From a practical standpoint, we suggest that a chess clock would be sufficient for measuring timing data. A chess clock provides a simplified binary division of simulation participant and debriefer contributions and would be more practical than our tested method. This approach could provide an estimate of how much the debriefer is talking compared with the participants [6]. With this in mind, we note that many debriefings in this study appear to fit a ‘sage on the stage’ category, as evidenced by the 9/12 debriefings in which facilitators talked for as long as or longer than the simulation participants. This important finding may be explained by the increasing requirement for simulation educators to wear multiple hats, or by a lack of training in our debriefer cohort. Debriefers may revert to more familiar approaches to teaching during debriefings, such as tutoring, explanation and lecturing [24]. To address this problem, timing data could help shape future behaviour. Of interest, in a concurrent study we are also investigating the use of cumulative tallies of debriefer questions, debriefer statements and simulation participant responses. In a similar way to the chess clock approach for timing, this may provide an easily measured estimate of debriefer inclusivity.
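To illustrate how such a binary split could be tallied in practice, the sketch below (our own illustration, not a validated instrument or the method used in this study) sums talk time per side from a simple turn log and reports the debriefer-to-participant ratio; the log structure, speaker labels and durations are assumed purely for the example.

```python
# Illustrative sketch: a chess-clock-style binary split of debriefing talk time.
# The Turn structure, speaker labels and durations are assumptions for this example.
from dataclasses import dataclass


@dataclass
class Turn:
    speaker: str      # e.g. "debriefer" or "participant_1"
    seconds: float    # duration of this spoken contribution


def talk_time_ratio(turns: list[Turn]) -> tuple[float, float, float]:
    """Return (debriefer seconds, participant seconds, debriefer:participant ratio)."""
    debriefer = sum(t.seconds for t in turns if t.speaker.startswith("debriefer"))
    participants = sum(t.seconds for t in turns if not t.speaker.startswith("debriefer"))
    ratio = debriefer / participants if participants else float("inf")
    return debriefer, participants, ratio


# Example turn log with made-up values.
log = [Turn("debriefer", 95), Turn("participant_1", 40),
       Turn("debriefer", 120), Turn("participant_2", 30)]
d, p, r = talk_time_ratio(log)
print(f"Debriefer {d:.0f} s vs participants {p:.0f} s (ratio {r:.2f})")
```

In such a tally, a ratio above 1 would flag a debriefing in which the facilitator spoke for longer than the participants, the pattern observed in 9/12 debriefings here.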
In regard to the conversational diagrams, these illustrations were used concurrently with the timing data (Fig. 2) for feedback to debriefers. Such diagrams were described by Dieckmann et al. in terms of typical roles in SBME, and by Ulmer et al., who described a variety of diagram shapes according to culture [7, 20]. We divided the debriefings into those where the debriefer(s) spoke more than or as much as the simulation participants (n = 9) and those where the debriefer(s) spoke less (n = 3). Using this binary division as a basis for analysis, we observed a pattern in the corresponding shapes of the diagrams (Fig. 2). Similar appearances and shape patterns were reported in Dieckmann’s and Ulmer’s papers [7, 20]. However, on close inspection of each diagram obtained, we could not find the triangular pattern described by Dieckmann et al., which is suggestive of 2 or 3 participants dominating (Fig. 1). The absence of this pattern was surprising, as contribution lengths varied widely (Fig. 2), with some participants not talking at all and others talking for > 6 min. This finding could be due to errors in the diagram drawings or to chance. Future studies could avoid uncertainty in this area by analysing debriefings carefully with the use of video and professional transcription.
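For readers who would prefer to tabulate such diagrams rather than draw them by hand, the sketch below (one possible implementation we assume for illustration, not the hand-drawn method used in this study) counts directed speaker-to-speaker transitions from an ordered list of turns; the resulting counts can then be rendered as a conversational diagram.

```python
# Illustrative sketch: tabulating speaker-to-speaker transitions for a conversational diagram.
# The speaker labels and sequence below are assumptions for this example.
from collections import Counter


def transition_counts(speakers: list[str]) -> Counter:
    """Count directed transitions such as ('debriefer', 'nurse_1') between consecutive turns."""
    return Counter(zip(speakers, speakers[1:]))


# Example ordered sequence of turns with made-up labels.
sequence = ["debriefer", "doctor_1", "debriefer", "nurse_1", "debriefer", "doctor_1"]
for (src, dst), n in transition_counts(sequence).most_common():
    print(f"{src} -> {dst}: {n}")
```

In this toy sequence every transition passes through the debriefer, a pattern consistent with the paucity of participant-to-participant interactions noted below.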
The astute reader will note that medical students contributed less in larger debriefings (i.e. cases 7, 8 and 11) and that nurses contributed less than physicians in mixed groups (i.e. cases 10 and 12). This observation underlines the importance of ensuring a simulation learning climate that feels safe for all, and of choosing debriefing topics that are of interest to all [25,26,27]. In this study, the majority of recorded interactions were between the debriefer and simulation participants. Very few interactions were recorded between the participants themselves; this important omission may represent a target for our own approaches to faculty development at a local level.
In summary, the simulation literature outlines an array of behavioural characteristics exhibited by debriefers that can promote improved future performance [6]. Existing assessment tools such as DASH have an established role in identifying these characteristics. Timing data and conversational diagrams may represent an adjunct that helps debriefers understand their performance and track changes over time, and that assists supervisors in providing debriefer feedback.
How does quantitative debriefing performance data compare to existing tools?
Existing debriefing assessment tools such as DASH have pros and cons that were briefly described in the background section. In this study, DASH scores were provided by all debriefers and simulation participants. While this was not the primary outcome, it shines a light on the limitations of DASH use. Of note, the 7 debriefers rated themselves significantly lower than the simulation participants rated them on all DASH elements. These findings reflect our personal experience of using DASH: prior to the study we had also observed debriefers underscoring themselves and simulation participants overscoring. This finding is interesting and may be explained by the phenomenon of ‘response bias’, a recognised problem of assessment scales and surveys [28, 29]. Variation in DASH scores between raters, as well as the time that DASH takes to complete, may reflect the relative subjectivity of the scores provided and limit its value for debriefer feedback [14]. Furthermore, neither DASH nor OSAD provides specific measurable goals for new debriefers to target in their next debriefing. We therefore suggest continued use of DASH for highlighting ideal behaviours, supplemented by the quantitative data tools we have outlined in this paper.
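As a concrete illustration of the comparison described above (not the study’s actual analysis, and with invented scores), debriefer self-ratings and the corresponding mean participant ratings could be compared with a paired non-parametric test such as the Wilcoxon signed-rank test.

```python
# Illustrative sketch: comparing debriefer self-rated DASH scores with participant ratings.
# All values are invented for the example; they are not study data.
from scipy.stats import wilcoxon

debriefer_self = [4.5, 5.0, 4.0, 4.5, 5.5, 4.0, 5.0]     # one self-rating per debriefing
participant_mean = [6.0, 6.4, 5.7, 6.1, 6.5, 6.2, 6.3]   # mean participant rating, same order

stat, p_value = wilcoxon(debriefer_self, participant_mean)
print(f"Wilcoxon signed-rank: W = {stat:.1f}, p = {p_value:.3f}")
```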
What is the potential role of these findings in the development of debriefers?
As stated, the recipe for success in debriefer faculty development may not come from a single approach. In this study, we found the availability of both quantitative and qualitative data useful. Our experience of using timing data and diagrams together was generally positive, but recording the data and applying this approach was resource intensive. Moreover, the recent pandemic has limited in-person SBME interactions, making current applicability questionable. In the context of the present remote learning climate, a recent study recognised that existing methods of faculty development lack a structured approach [30]. We agree that structure is clearly an important factor that faculty development programmes might lack [11]. The quantitative approaches described in our work may help provide this structure at the point of care by focusing our attention when observing debriefings. The approaches described should not supersede local planning or adequately resourced, culturally sensitive debriefer faculty development [11, 30].
In terms of other solutions to the relative lack of structure in faculty development programmes, some experts have proposed the use of DebriefLive®, a virtual teaching environment that allows any debriefer to review their performance [30]. Using this (or similar) software could allow debriefers to observe videos, rate themselves and track their progress. In view of the current need for social distancing and remote learning, video review may be an alternative to the paper forms and real-time feedback used in our study [31,32,33].