Eye tracking metrics and leader’s behavioral performance during a post-partum hemorrhage high-fidelity simulated scenario

Background The use of eye tracking in the simulated setting can help improve our understanding of what sources of information clinicians are using as they deliver routine patient care. The aim of this simulation study was to observe the differences, if any, between the eye tracking patterns of leaders who performed best in a simulated postpartum hemorrhage (PPH) high-fidelity scenario, in comparison with those who performed worst. Methods Forty anesthesia trainees from the University of Catania Medical School were divided into eight teams, to enact four times the same scenario of a patient with postpartum hemorrhage following vaginal delivery. Trainees who were assigned the leader’s role wore the eye tracking glasses during the scenario, and their behavioral skills were evaluated by two observers, who reviewed the video recordings of the scenarios using a standardized checklist. The leader’s eye tracking metrics, extracted from 27 selected areas of interest (AOI), were recorded by a Tobii Pro Glasses 50 Hz wearable wireless eye tracker. Team performance was evaluated using a PPH checklist. After completion of the study, the leaders were divided into two groups, based on the scores they had received (High-Performance Leader group, HPL, and Low-Performance Leader group, LPL). Results In the HPL group, the duration and number of fixations were greater, and the distribution of gaze was uniformly distributed among the various members of the team as compared with the LPL group (with the exception of the participant who performed the role of the obstetrician). The HPL group also looked both at the patient’s face and established eye contact with their team members more often and for longer (P < .05). The team performance (PPH checklist) score was greater in the HPL group (P < .001). The LPL group had more and/or longer fixations of technical areas of interest (P < .05). Conclusions Our findings suggest that the leaders who perform the best distribute their gaze across all members of their team and establish direct eye contact. They also look longer at the patient’s face and dwell less on areas that are more relevant to technical skills. In addition, the teams led by these best performing leaders fulfilled their clinical task better. The information provided by the eye behaviors of “better-performing physicians” may lay the foundation for the future development of both the assessment process and the educational tools used in simulation. Trial Registration Clinical.Trial.Gov ID n. NCT04395963.


(Continued from previous page)
Conclusions: Our findings suggest that the leaders who perform the best distribute their gaze across all members of their team and establish direct eye contact. They also look longer at the patient's face and dwell less on areas that are more relevant to technical skills. In addition, the teams led by these best performing leaders fulfilled their clinical task better. The information provided by the eye behaviors of "better-performing physicians" may lay the foundation for the future development of both the assessment process and the educational tools used in simulation. Trial Registration: Clinical.Trial.Gov ID n. NCT04395963.
Keywords: High-fidelity scenario, Behavioral skills, Eye tracking Background Eye tracking is the process of measuring eye movements, using a device called an eye tracker, to register attention behavior while performing a task.
The principle underlying the use of eye tracking technology is the "eye-mind" theory [1], that suggests there is a relationship between where the individuals look (fix their gaze) and the insight into the cognitive processes that guide this scanning, essentially what they are paying attention to or thinking about at that point in time.
Although cognitive processes are complex, and it is possible that an individual may be simultaneously fixating on one thing but thinking about something else, studies have demonstrated that an individual's attention and thoughts at a given point in time most likely correspond to their point of fixation [1,2].
In the medical field, perception and visual expertise have a high impact on work efficiency and effectiveness, as well as on the correctness of analysis and diagnosis [3].
Eye tracking is able to provide reliable quantitative data, which can be interpreted to give an indication of clinical skill, provide training solutions, and aid in feedback and reflection.
Overall eye tracking methodology has contributed significantly to training assessment and has been used in simulation practices in the attempt to better understand insights into how data are collected and acted on during high-load cognitive processes [4][5][6][7][8].
The use of eye tracking in the simulated setting can help improve our understanding of what sources of information clinicians are and are not using as they deliver routine patient care. Using these data, one can then identify what sources of information individuals who are not making errors are using and in what sequence these individuals are gathering that information.
The aim of this simulation study was to observe the differences, if any, between eye tracking metrics of leaders who performed best in a simulated postpartum hemorrhage (PPH) high-fidelity scenario, in comparison with those who performed the worst.

Method
The authors registered the study at Clinical.Trial.Gov (ID n. NCT04395963).
Forty PGY4-5 anesthesia trainees volunteered from the University of Catania Medical School to be enrolled in this prospective, observational study. Each participant gave informed written consent, and privacy, confidentiality, and anonymity were fully guaranteed by the EESOA Research Board.
In our region, simulation centers do not have access to a formal ethical approval process.
Our simulation center adheres and follows the Healthcare Simulationist Code of Ethics supported by the Society for Simulation in Healthcare [9]. Our study was eligible for exemption, in accordance with US Federal Human Subject Regulations-Protection of Human Subjects, due to the nature of the study itself, as no patients were involved, the trainees participating were volunteers, the researchers ensured that those taking part in the research would not be caused distress. All participants' personal and other data were completely anonymized, and all the investigators had no conflict of interest and were not involved in any of the participants' university teaching programs.
We studied eight teams, each containing five participants, and the scenario was repeated four times in order to note any difference in behavioral performance in the participants who were given the role of leader. Our study was a typical "high-fidelity simulation with role exchange" [10]. It is well-known how exchanging professional roles helps professionals understand and "put themselves in the shoes" of their colleagues. This technique has a high didactic value as it trains trainees to better understand the points of view of other healthcare professionals participating in the emergency. For the purpose of the study, we randomly assigned the "leader's role" to the same subject and rotated the others during the four scenarios (assigning in turn the roles of midwife, obstetrician, nurse, and anesthesia trainee) in such a way that at the end of the rotation, each of them had participated with a different role.
For the leader's role, we selected those who had the most experience in obstetric anesthesia, based on the time spent in the delivery room during their curricular rotations. Thereafter, we randomly gave them the role of "senior anesthesiologist" in each scenario, making sure that among the different roles assigned, the one of "senior" was the only one they interpreted. In this way, we expected that the participant who was assigned the role of "senior anesthesiologist" (and who in fact had the most experience in obstetric anesthesia) would take the leadership. In the case of shared leadership with someone else, the case was not included in the study.
Every trainee who was assigned the leader's role wore the eye tracking glasses during the scenario.
For this study, we used a commercially available Tobii Pro Glasses 50 Hz wearable wireless eye tracker. This system can measure eye movements using cameras integrated into the eyeglasses which record the corneal reflection of infrared lighting to track pupil position, mapping the subject's focus of attention on video recordings of the subject's field of vision (gaze). In addition to tracking gaze, this system also enables the measurement of various eye metrics including fixation frequency and dwell time, used as a surrogate measure of perceived stimulus importance [11].
All the eye-tracked procedures were recorded immediately after accurate individual calibration, during which the participant, after wearing the glasses unit, focused on the center of the calibration target.
All the eye tracking video recordings were stored and analyzed using the Tobii Pro Lab Software. We selected 27 areas of interest (AOI) (Figs. 1 and 2), to define regions of a displayed stimulus and to extract metrics specifically for those regions as follows: Eleven AOI concerning the simulation room: airway suction, anesthesia trolley, blood loss, drip phleboclysis, ECG monitor, clock, oxygen source, control room, phone, blood loss collector bag, and other (the space in general) (Figs. 1 and 2) Eight AOI concerning the participants: anesthesia trainee, anesthesia trainee's eyes, obstetrician, obstetrician's eyes, nurse, nurse's eyes, midwife, midwife's eyes (Fig. 2) Eight AOI concerning the manikin: right arm (on which the sphygmomanometer was placed), the left arm (on which two intravenous accesses were set), right leg, left leg, belly, trunk, vagina, face (Fig. 3).
The number and duration of fixations for each area of interest were examined.
The fixation points were generated by the Tobii software's filter according with the following parameters: max gap length 75 ms; noise reduction: window size (samples) 3; velocity calculator: window length 20 ms; merge adjacent fixations: max time between fixations 75 ms; max angle between fixations 0.5°; minimum fixation duration 60 ms.
Each fixation point was assigned manually to a specific AOI by an independent, blinded investigator (simulation technician specifically trained in eye tracking) who reviewed the video recording of each scenario.
The eye tracking metrics were mapped as gaze plots and heat maps. Heat maps and gaze plots are data visualizations that can communicate important aspects of visual behavior clearly and with great power. Gaze plots Heat maps show how looking is distributed over the stimulus and can effectively reveal the focus of visual attention.
The evaluation of the behavioral skills of the leader and of the technical skills of the team was made by two expert observers not involved in the scenarios, who independently reviewed the video recordings of the scenarios.
The technical skills of the team were evaluated on the basis of the completion of a PPH checklist. For the design of this checklist, we reviewed PPH guidelines from recognized obstetric bodies and literature, relevant papers from the literature, and their institutional PPH protocol [12][13][14][15].
We then chose, by consensus, the final action items for the checklist, identifying 25 standardized key tasks for inclusion on the PPH checklist. We assigned one point for each task executed, for a maximum of 25 points. This checklist worked as the reference guide for pre-scenario briefing and for the team's technical skills evaluation during the scenario (Appendix 1).
We also developed a standardized questionnaire for the evaluation of the behavioral skills of the leader (Appendix 2), derived from the Anaesthetists' Non-Technical Skills (ANTS) behavioral marker system [16] and the Ottawa Global Rating Scale (GRS) [17]. Each independent observer assigned a score for leadership, communication, and situational awareness (Appendix 2).
Interobserver reliability was also calculated No formal training took place before the first scenario, in order to consider the first scenario as the participants' baseline performance. All the teams underwent standardized educational training on PPH Guidelines immediately before the second, third, and fourth scenarios.
The scenario consisted of a severe PPH (> 1500 mL blood loss) due to refractory uterine atony in a multiparous 28-year-old patient who had undergone a spontaneous vaginal delivery. The patient became tachycardic and hypotensive consistent with hemorrhagic shock. All  simulations were performed in the simulation room of the EESOA Simulation Center (Rome) using a highfidelity manikin (Sim Mom Maternal and Neonatal Birthing Simulator, Laerdal, Norway). All scenarios were videotaped. The scenario was stopped when each team had completed all 25 tasks of the checklist, or when 15 min had elapsed. Each scenario was followed by a standardized debriefing led by an expert debriefer.
A study investigator, expert in both PPH and simulation debriefing and not involved in the simulation activity, observed each scenario in the control room, to record and check the team's performance (PPH evaluation and treatment), according to the established 25 PPH key tasks (Appendix 1). The leaders' behavioral scores (Appendix 2) were assigned by two observers, experts in communication and evaluation in simulation and not involved in the scenarios, who reviewed the videos of each simulation.
After completion of the study, all the leaders were divided into two groups, depending on the scores received for their leadership behavioral skills during the scenarios, and their eye tracking metrics were compared. We divided all the leadership situations into two groups: the High-Performance Leader group, HPL, which included all the leaders who had received the highest scores (score = 5) and the Low-Performance Leader group, LPL, which included the leaders who had received the lowest scores (score: 1-2) during the four scenarios.

Statistical analysis
Data are presented as means, confidence intervals (95% CI), and standard deviations (SD).
The leaders' performances were calculated by using the means of the scores given by each independent observer on leadership, communication, and situational awareness (Appendix 2).
In order to better discriminate the best and worst performance assessment, a linear transformation to convert the scale into a 5-point scale was used.
The eye tracking metrics were compared by using a two-way unpaired t test with lower and higher alternative hypothesis to compare the two groups.
The overall team performance (assessed by the PPH checklist) from the first to the fourth scenarios was examined by using the ANOVA test and Dunnett's post hoc test.
It was not possible to calculate the sample size a priori because at the start of the study, it was obviously unreasonable to determine how many leaders would perform well or poorly.
The post hoc power analysis, set at a significance level of 0.95 and a calculated effect size for almost all comparison above 1, was in the range of 60-80% power.
The Cohen's Kappa coefficient was applied to measure the degree of agreement between the two assessors (inter-rater reliability).

Results
All the participants successfully completed the scenarios. Out of a total of 32 planned "leadership situations" (performed by each leader of the eight groups, playing 4 scenarios), in eight situations, the leaders received the best scores (score = 5) (HPL group) and in seven, the leaders received the lowest scores (1-2) (LPL group). All the other episodes that had an intermediate score, including the cases of shared leadership, or in the case of the leadership being taken by another participant, were not considered for the data analysis. There was a high level of concordance (k = 0.92) between the two observers who independently made the evaluations. In Table 1 and Fig. 4, the duration and number of fixations of the leaders on each of the participants in the scenario are reported.
In the HPL group, the average duration and number of fixations were greater, and the distribution of gaze was uniformly distributed among the various members of the team as compared with the LPL group (with the exception of the participant who performed the role of the obstetrician).
The leaders who performed better established eye contact with their team members more often and for longer (with the exception of the participant who performed the role of the obstetrician) than the leaders who performed worse.
In Table 2 and Figs. 5 and 6, the number and duration of fixations of the leader on the simulation room and manikin are reported. In Figs. 7 and 8, respectively, the heat maps of the highly performing leaders and of the low-performing ones are reported.
The leaders who performed worse had more and/or longer fixations on some technical areas of interest, such as the right arm (on which the sphygmomanometer was placed) (P < .05), the left arm (on which two intravenous accesses were set) (P < .05), and the space in general (P < .05). They also fixed their gaze on the anesthesia cart for a longer time (P < .05). The number of fixations of the patient's face was greater in the leaders who performed better (P < .05).
The overall team performance score was greater in the HPL group (P < .001).
Interestingly, eye tracking video recordings allowed us to notice that no leader looked at the glass window of the control room, at the microphones or at the video cameras of the simulation room, and this may support the full and life-like involvement of the participants in the scenarios.

Discussion
Many eye tracking studies have been undertaken in simulated clinical settings. However, eye tracking has mostly been used to assess the performance of clinical skills [4,18,19] or to compare novice and expert clinician's visual scan paths when performing a practical task [19,20]. Eye tracking has also been reported as being an aid during the post scenario debriefing [21,22]. In one study [23], the performance and human factors in high-fidelity simulation of postpartum hemorrhage were analyzed by the visual behavior analysis using a standardized fixed eye tracking protocol, during viewing of a simulated video. In this study, the authors concluded suggesting a "neuroscientific approach with new technology like eye tracking to provide a new objective and more sensitive insights on human factors in simulated medical emergency situations". In line with this previous study, we made an attempt to use a mobile  (glasses) eye tracking technology to investigate behavioral rather than technical or practical skills during a high-fidelity scenario. Our findings suggest that there is an association between the leaders' distribution of gaze across all members of their respective teams, establishing and maintaining eye contact, and their best performance. A prolonged look into the patient's face and a decreased dwelling on areas that are more relevant to technical skills were also associated with the best performing leaders' ocular behavior.
The use of eye tracking technology to capture the eye movements of better-performing clinicians during a high-fidelity simulated scenario may help to provide new insight into behaviors associated with early identification of medical errors and adverse events. Knowledge of eye movements may also potentially supplement other methods of collecting insight into best performances, such as direct observation, verbal reports, and thinking out loud. Eye contact with the patient and team members may be a marker of better communication by the team leader, since it is well-known that eye contact (gazing) is a marker and a tool of good communication, and it is perhaps the most powerful way we communicate [24]. Our preliminary results by using the eye tracking technology support the hypothesis that team leaders should establish direct eye contact with their team, in order to better communicate with them. Interestingly, the leaders who performed best also looked at the face of the patient more.
It is more important for the leader to take the human behavior into account, dedicating more verbal and nonverbal attention to the team and the patient, rather than getting lost in technical details. This principle is obvious enough for those who practice and teach simulation, but we believe that our study demonstrates that it is possible, by using the eye tracking technique, to quantify and measure this behavioral attention. Notably, the previously described visual behavior of the leaders was associated with their team's better technical skill score.
Our study has some limitations. We recognize that as the leader becomes familiar with the team members, the eye contact with them may have changed, and this has to be taken into consideration. However, we still believe that familiarity with the team in the actual life may be a value to highlight and not necessarily a limitation.
We recognize that the experimental situation created for our study is not exactly the one that occurs in real life, when an obstetric emergency usually requires a shared leadership between senior anesthesiologist and senior obstetrician, and that therefore our results cannot be fully extrapolated to clinical practice. The purpose of our study, and consequently its design, was to observe changes in the eye tracking metrics of the leader and not the relationship between the leaders in a shared leadership situation, which requires a much more complex design than ours. Nevertheless, we believe our results may have a value in supporting the eye tracking method as a possible additional tool for the observation of the participants in a simulated scenario.
We also recognize that as much as we have tried to homogenize the clinical evolution of the scenario, a definite standardization was not possible due to the nature of the life-like high-fidelity methodology used.
We are aware that both eye tracking and human behavior analysis are subject to a number of variables that are very difficult to control in a simulation environment. We, however, believe that they could nevertheless present an opportunity to quantify some of these communication variables.

Conclusions
In conclusion, our findings suggest that the leaders who perform the best distribute their gaze uniformly across all members of their team and establish direct eye contact. They also look longer at the patient's face and dwell less on areas that are more relevant to technical skills. In addition, the teams led by these bestperforming leaders fulfilled their clinical task better. This is the first study that has used mobile eye tracking technology to investigate behavioral rather than technical or practical skills during a high-fidelity scenario. The information provided by the eye behaviors of "better-performing physicians" may lay the foundation for the future development of both the assessment process and the educational tools used in simulation.