However, the accuracy of commercial devices is largely unknown. Few interventions are as effective as physical activity in reducing the risk of death, yet we have achieved limited success in programs designed to help individuals exercise more. More recent improvements in battery longevity, together with miniaturization of the processing hardware that turns raw signals into interpretable data in real time, led to the commercial development of wrist-worn devices for physiological monitoring. Prior studies of wrist-worn devices have focused on earlier-stage devices, or have focused exclusively on heart rate (HR) or estimation of energy expenditure (EE).
Some have made comparisons among devices without reference to the U. None proposed an error model or framework for device validation. Devices were tested in two phases. The first phase included the Apple Watch, Basis Peak, Fitbit Surge and Microsoft Band. Stanford University and local amateur sports clubs. In the running test, the subject began the test running at 5. Each minute, the speed was increased by 0.
To complete the test within a 10-minute period, the incline was increased by 0. 7 until the subject reached volitional exhaustion. All participants provided informed consent prior to the initiation of the study. The Apple Health app provided heart rate, energy expenditure, and step count data sampled at one-minute granularity.
Minute-granularity data was downloaded directly from the Basis app. The mitmproxy software tool was utilized to extract data from the Microsoft Band, following the technique outlined by J. Sampling granularity varied by activity and subject. Mio Alpha 2: The raw data from the Mio device is not accessible.
However, static images of the heart rate over the duration of the activity are stored in the Mio phone app. The SQLite3 database stores data sampled at three-second granularity. Three-second samples for the last minute of each activity state were averaged to generate heart rate and energy expenditure values for the activity state. Samsung Gear S2: Raw data from the Samsung Gear is not accessible to users. However, heart rate and step count over time are displayed as static images within the Samsung Gear App.
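The last-minute averaging applied to the three-second samples can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: the function name, the `(timestamp, value)` sample layout, and the synthetic heart-rate trace are all assumptions made for the example.

```python
from statistics import mean


def last_minute_average(samples, state_end_s, window_s=60.0):
    """Average the values sampled in the final `window_s` seconds of a state.

    `samples` is a list of (timestamp_seconds, value) pairs and `state_end_s`
    is the time at which the activity state ended. Both are hypothetical
    inputs chosen for illustration.
    """
    window = [v for t, v in samples if state_end_s - window_s < t <= state_end_s]
    if not window:
        raise ValueError("no samples in the averaging window")
    return mean(window)


# Synthetic trace: one sample every 3 s over a 5-minute activity state.
# Heart rate ramps from 100 to 120 bpm over 4 minutes, then holds at 120.
trace = [(t, 100 + min(t, 240) / 12) for t in range(3, 301, 3)]
hr_state = last_minute_average(trace, state_end_s=300)
```

Averaging only the final minute is meant to capture the steady-state value for the activity and to discard the transition from the previous state.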
Principal component analysis was performed to identify outliers and to cluster devices by error profiles. Variables were not centered, so as to find components of deviation about zero, and the loadings for each principal component were computed. Several regression approaches were applied to uncover associations in the dataset. R was used to fit a linear regression model.
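To make the uncentered principal component analysis concrete, the sketch below extracts the first component without subtracting column means, so the component describes deviation about zero rather than about the mean. It is an illustration under stated assumptions, not the study's code: the study used R, whereas this uses plain Python with power iteration, and the matrix layout (rows as subjects, columns as devices) and the error values are hypothetical.

```python
import math


def first_uncentered_component(rows, iterations=200):
    """Return (loadings, scores) for the first principal component of `rows`.

    Column means are NOT subtracted, so the component is the dominant
    eigenvector of the second-moment matrix X^T X, found by power iteration.
    """
    n_cols = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(n_cols)]
            for i in range(n_cols)]
    v = [1.0 / math.sqrt(n_cols)] * n_cols
    for _ in range(iterations):
        w = [sum(gram[i][j] * v[j] for j in range(n_cols)) for i in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(r[j] * v[j] for j in range(n_cols)) for r in rows]
    return v, scores


# Hypothetical error matrix: rows are subjects, columns are three devices.
errors = [
    [0.01, 0.02, 0.02],
    [0.02, 0.04, 0.04],
    [-0.01, -0.02, -0.02],
    [0.10, 0.20, 0.20],  # a subject with unusually large error on every device
]
loadings, scores = first_uncentered_component(errors)
outlier = max(range(len(scores)), key=lambda i: abs(scores[i]))
```

The loadings indicate how strongly each device contributes to the component, and subjects with large absolute scores are candidate outliers.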
BMI, von Luschan skin tone, and VO2max. In a parallel approach, a generalized estimating equation (GEE) was used to perform a regression analysis with device error as a response variable and device name, activity type, activity intensity, sex, age, height, weight, BMI, skin tone, wrist circumference, and VO2max as predictor variables. Interaction terms between the predictor variables of sex and age, activity and device, and intensity and device were included in the analysis. Regression was then performed with device type as the predictor variable, and the root mean square error values across subjects as the response variable. The Apple Watch served as the base factor value. The effects for other devices served as contrasts with Apple.
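The role of a base factor level in the device regression can be illustrated with a small sketch. For ordinary least squares with a single categorical predictor and treatment coding, the intercept equals the mean response of the baseline level, and every other coefficient equals that level's mean minus the baseline mean. The code relies on that identity rather than fitting a model, and the device labels and RMSE values are hypothetical placeholders, not results from the study.

```python
from statistics import mean


def treatment_contrasts(rmse_by_device, baseline):
    """Coefficients of OLS on one categorical predictor with treatment coding.

    The intercept is the baseline group's mean; each contrast is the
    difference between a device's mean and the baseline mean.
    """
    intercept = mean(rmse_by_device[baseline])
    contrasts = {
        device: mean(values) - intercept
        for device, values in rmse_by_device.items()
        if device != baseline
    }
    return intercept, contrasts


# Hypothetical RMSE values (arbitrary units) for three placeholder devices.
rmse = {
    "baseline_device": [2.0, 4.0],
    "device_b": [5.0, 7.0],
    "device_c": [1.0, 3.0],
}
intercept, contrasts = treatment_contrasts(rmse, baseline="baseline_device")
```

A positive contrast means the device has a higher mean error than the baseline device, and a negative contrast means a lower one.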
The R statistics package was used to fit a gamma distribution. Measurement error relative to gold standard was averaged across all devices for a subject. 05 to be within acceptable limits since this approximates a widely accepted standard for statistical significance, and there is precedent within health sciences research for this level of accuracy in pedometer step counting. To gain a sense of the overall performance of each device for each parameter, a mixed effects linear regression model was utilized, allowing for repeated measures on subjects. The lowest error in measuring HR was observed for the cycle ergometer task, 1.
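As a rough sketch of how error relative to the gold standard might be computed and then averaged across devices for each subject, consider the code below. The signed, fractional form of the error, the data layout, and all values are assumptions made for illustration. The text above does not spell out the exact formula.

```python
from statistics import mean


def relative_error(device_value, gold_value):
    """Signed device error as a fraction of the gold-standard value."""
    return (device_value - gold_value) / gold_value


def mean_error_per_subject(readings):
    """Average relative error across devices for each subject.

    `readings` maps subject -> (gold_value, {device: device_value});
    this layout is hypothetical.
    """
    return {
        subject: mean(relative_error(v, gold) for v in by_device.values())
        for subject, (gold, by_device) in readings.items()
    }


# Hypothetical heart-rate readings in beats per minute.
readings = {
    "s1": (100.0, {"device_a": 110.0, "device_b": 90.0}),
    "s2": (200.0, {"device_a": 220.0, "device_b": 240.0}),
}
per_subject = mean_error_per_subject(readings)
```

With a signed error, overestimates and underestimates can cancel, as they do for subject `s1`. Averaging absolute errors instead avoids that cancellation.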
Median error rates across tasks varied from 27. Both principal component analysis and regression via the generalized estimating equation revealed that activity intensity and sex were significant predictors of error in the measurement of HR. A Tukey HSD difference of means of 43. HR error in the running task for the Apple Watch. The generalized estimating equation regression identified seven predictor terms that were significantly associated with heart rate device error. Our finding that HR measurements are within an acceptable error range across a range of individuals and activities is important for the consumer health environment and for practitioners who might be interested in using such data in a clinical setting.
These findings are in agreement with prior work looking at fewer devices in a smaller number of less diverse individuals. Covariates such as darker skin tone, larger wrist circumference, and higher BMI were found to correlate positively with increased HR error rates across multiple devices. In contrast with low reported error for HR measurement, no device met our prespecified error criterion for energy expenditure. It is not immediately clear why EE estimations perform so poorly. While calculations are proprietary, traditional equations to estimate EE incorporate height, weight, and exercise modality. We only tested devices and algorithms that were available at the time of our study.