
Original research
Daily self-reported and automatically generated smartphone-based sleep measurements in patients with newly diagnosed bipolar disorder, unaffected first-degree relatives and healthy control individuals
  1. Sharleny Stanislaus1,2,
  2. Maj Vinberg1,
  3. Sigurd Melbye1,2,
  4. Mads Frost3,
  5. Jonas Busk4,
  6. Jakob Eyvind Bardram4,
  7. Maria Faurholt-Jepsen1,
  8. Lars Vedel Kessing1,2
  1. 1 Copenhagen Affective Disorder Research Center (CADIC), Psychiatric Center Copenhagen, Rigshospitalet, Copenhagen, Denmark
  2. 2 Faculty of Medical Sciences, University of Copenhagen, Copenhagen, Denmark
  3. 3 Monsenso Aps, Copenhagen, Denmark
  4. 4 Copenhagen Center for Health Technology (CACHET), Department of Health Technology, Technical University of Denmark, Lyngby, Denmark
  1. Correspondence to Dr Sharleny Stanislaus, Rigshospitalet, Psychiatric Center Copenhagen, Copenhagen 2100, Denmark; sharleny.stanislaus.01{at}regionh.dk

Abstract

Objectives (1) To investigate daily smartphone-based self-reported and automatically generated sleep measurements, respectively, against validated rating scales; (2) to investigate if daily smartphone-based self-reported sleep measurements reflected automatically generated sleep measurements and (3) to investigate the differences in smartphone-based sleep measurements between patients with bipolar disorder (BD), unaffected first-degree relatives (UR) and healthy control individuals (HC).

Methods We included 203 patients with BD, 54 UR and 109 HC in this study. To investigate whether smartphone-based sleep calculated from self-reported bedtime, wake-up time and screen on/off time reflected validated rating scales, we used the Pittsburgh Sleep Quality Index (PSQI) and sleep items on the Hamilton Depression Rating Scale 17-item (HAMD-17) and the Young Mania Rating Scale (YMRS).

Findings (1) Self-reported smartphone-based sleep was associated with the PSQI and sleep items of the HAMD and the YMRS. (2) Automatically generated smartphone-based sleep measurements were associated with daily self-reports of hours slept between 12:00 midnight and 06:00. (3) According to smartphone-based sleep, patients with BD slept less between 12:00 midnight and 06:00, with more interruption and daily variability compared with HC. However, differences in automatically generated smartphone-based sleep were not statistically significant.

Conclusion Smartphone-based data may represent measurements of sleep patterns that discriminate between patients with BD and HC and potentially between UR and HC.

Clinical implication Detecting sleep disturbances and daily variability in sleep duration using smartphones may be helpful for both patients and clinicians for monitoring illness activity.

Trial registration number clinicaltrials.gov (NCT02888262).

  • adult psychiatry
  • depression & mood disorders


Introduction

In addition to alterations in mood and activity, changes in sleep–wake patterns are a central symptom of bipolar disorder (BD).1 Sustaining a stable sleep–wake balance is essential for obtaining and maintaining mood stability in patients with BD.2 Apart from alterations in sleep during depression, (hypo)mania and mixed episodes, emerging evidence suggests that sleep disturbances are also present between mood episodes3–5 and are associated with impaired quality of life and global functioning6 and an increased risk of relapse.7 8 In the prodromal phase of BD, sleep disturbances seem to be prominent symptoms,9 are present to a higher degree in first-degree relatives than in control individuals10 and have been suggested to be a risk factor for BD.11 12 Overall, these observations underline the critical role of sleep in the diagnosis, monitoring and treatment of patients with BD.

Self-reports of sleep are primarily based on clinical interviews, questionnaires and sleep diaries.13 However, these methods rely on retrospective information with the risk of recall bias and backfilling.14 The gold standard for objective sleep assessment is polysomnography,15 a method not suitable for daily monitoring. Wrist-worn actigraphy is another widely available method that correlates with polysomnography16 and is suitable for daily sleep–wake monitoring in naturalistic settings. However, actigraphy assesses sleep–wake rhythm based solely on one parameter (acceleration) and, as with any wearable long-term monitoring, suffers from adherence concerns.17 Also, due to the Hawthorne effect, awareness of monitoring might cause alterations in sleep–wake rhythm.18 19 Therefore, an unobtrusive and easily accessible platform is necessary to monitor sleep–wake patterns longitudinally, especially in patients with BD presenting with complex behavioural alterations in episodes of different polarities and duration.

Smartphones offer a unique platform to collect self-reported sleep information repeatedly during naturalistic settings. Recently, smartphone-based apps have been developed, aiming to estimate sleep–wake patterns based on both self-reported information and automatically generated data.20–22 The use of information concerning when the screen is turned on/off has been found to be a reliable and low-cost method to infer sleep–wake patterns.23 24 Additionally, the duration the screen is turned on per day has been associated with reduced duration and quality of sleep in healthy individuals.25 Nevertheless, smartphone-based self-reported and automatically generated sleep measurements have not been systematically compared with validated rating scales addressing sleep disturbance. Furthermore, it has not been investigated whether daily smartphone-based self-reported and automatically generated sleep measurements differ between patients with BD, unaffected first-degree relatives (UR) and healthy control individuals (HC).

Objectives

The present study had three overall aims: First, to investigate if smartphone-based self-reported and automatically generated sleep measurements reflected validated measures of sleep including (a) a validated sleep questionnaire and (b) sleep disturbance assessed by trained clinicians according to sleep items of rating scales addressing the severity of depressive and manic symptoms, respectively. Second, to investigate associations between automatically generated smartphone-based sleep measurements and self-reported smartphone-based sleep. Third, to investigate whether smartphone-based self-reported and automatically generated sleep measurements would be able to discriminate between patients with newly diagnosed BD, UR and HC.

We hypothesised that (1) smartphone-based self-reported and automatically generated sleep (measured as mean hours slept between 12:00 midnight and 06:00, and daily variability in hours slept between 12:00 midnight and 06:00) would be associated with global measures of sleep; (2) daily smartphone-based self-reported and automatically generated sleep measures would be associated and (3) patients with BD would have higher day-to-day variability in sleep timing and more sleep disturbances compared with HC and UR with UR representing an intermediary group.

Methods

Study design

The present study is a part of the Bipolar Illness Onset (BIO) study.26 Three groups of participants were included in the study: patients with BD, UR and HC. The data included in the present paper were collected from participants enrolled in the BIO study from September 2016 to February 2019. All participants were invited for assessments at baseline and annually for up to 3 years. Additionally, patients with BD were invited for assessments after a shift in a mood episode.

Study participants

The Schedules for Clinical Assessment in Neuropsychiatry (SCAN) interview27 was used to confirm the diagnosis of BD (or the lack thereof) according to the International Classification of Diseases, 10th Revision (ICD-10).28

All newly diagnosed patients with BD living in the Capital Region of Denmark, who were offered a 2-year programme at the Copenhagen Affective Disorder Clinic, Copenhagen, Denmark, were invited to participate in the study.29 The inclusion criteria were newly diagnosed BD or a single manic episode according to the ICD-10 and age 15–70 years.

Unaffected relatives, including siblings or children of the patients with BD, were invited to participate in the study. Exclusion criteria were any previous or current psychiatric diagnosis lower than F34.0 according to ICD-10 (ie, organic mental disorders, mental and behavioural disorders due to psychoactive substance use, schizophrenia or other psychotic disorders, affective disorders).

HC, without a treatment-requiring psychiatric disorder in the individual or among the individual’s first-degree family members, were recruited among blood donors aged 15–70 years from the Blood Bank at Rigshospitalet, Copenhagen, Denmark.

Observer-based rating scales and self-reported questionnaires were administered at each visit with the researchers. Due to the longitudinal design of the study, with a different number of visits with the researchers for each participant, some participants provided repeated self-reported and clinician-based rating scales.

Observer-based rating scales

Sleep items from two observer-based clinical rating scales, namely the Hamilton Depression Rating Scale 17-item (HAMD-17)30 and the Young Mania Rating Scale (YMRS),31 were used to assess the degree of sleep disturbances for the past 3 days. A summary measure of subitems 4–6 of the HAMD-17 was used to evaluate the degree of insomnia, and item 4 of the YMRS was used as an estimate of reduced sleep duration.

Patient-reported questionnaires

The Pittsburgh Sleep Quality Index (PSQI), a self-reported retrospective questionnaire, was used to evaluate overall sleep quality for the past month.32 It consists of 19 items comprising 7 sleep factors, including estimates of sleep duration, subjective sleep quality, sleep disturbances, sleep efficiency, sleep latency, daytime dysfunction and use of sleeping medication. Higher scores indicate reduced sleep quality.32 The PSQI has been found to have a good test–retest reliability for the global score.33

Smartphone-based monitoring

All included participants installed an app on their iPhone or Android smartphones (the Monsenso app). Automatically generated smartphone-based data, including information regarding screen on/off, were obtainable only from Android smartphones. All participants entered bedtime (defined as the time the light was turned off and the participant tried to sleep) and wake-up time (final awakening time) daily (figure 1). Participants were asked to enter smartphone-based data in real time, that is, daily. If the participant forgot to enter data for any specific day, it was possible to enter data retrospectively for up to 2 days. Self-reports could be filled out throughout the day until 12:00 midnight. Using the reminder functionality of the smartphone, the participants were reminded daily to fill in the survey. The study was observational, and participants did not receive feedback via the app. Patients with BD were asked to fill in self-assessments for as long as possible and preferably for at least 3 months, whereas the UR and HC were asked to fill in daily assessments for at least 1 month. Based on the smartphone-based self-reported bedtime and wake-up time, we calculated sleep duration (wake-up time − bedtime), mid-sleep (bedtime + sleep duration/2) and the number of hours slept between 12:00 midnight and 06:00. Daily variation in sleep duration and in the number of hours slept between 12:00 midnight and 06:00 was calculated by applying the root mean square successive difference (RMSSD) method, that is, taking the square root of the mean of the squared differences between each day’s and the previous day’s sleep value.34 The automatically generated screen features (the number of times the screen was turned on and off per day and the intervals between on and off times) were used to estimate the time the screen was off between 12:00 midnight and 06:00. We used the screen on/off features based on the assumption that most people lock their phones at bedtime and unlock them after awakening. Therefore, the screen on/off time may be a good estimate for sleep.25
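The derived measures described above can be illustrated with a minimal Python sketch. It is not the study's implementation: the function names, the event format (timestamp plus an "on"/"off" state) and the choice to place the 12:00 midnight to 06:00 window on the calendar day of the final awakening are assumptions made here for illustration. Sleep duration is taken as wake-up time minus bedtime so that it is positive, and RMSSD is computed as the square root of the mean squared day-to-day difference.

```python
from datetime import datetime, timedelta
from math import sqrt


def overlap_hours(start, end, win_start, win_end):
    """Hours of the interval [start, end] that fall inside [win_start, win_end]."""
    latest_start = max(start, win_start)
    earliest_end = min(end, win_end)
    return max((earliest_end - latest_start).total_seconds(), 0.0) / 3600.0


def night_window(wake_up):
    """Midnight-to-06:00 window on the day of final awakening (an assumption)."""
    midnight = wake_up.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight, midnight + timedelta(hours=6)


def sleep_measures(bedtime, wake_up):
    """Sleep duration, mid-sleep and hours slept between midnight and 06:00."""
    duration = wake_up - bedtime
    win_start, win_end = night_window(wake_up)
    return {
        "duration_h": duration.total_seconds() / 3600.0,
        "mid_sleep": bedtime + duration / 2,
        "night_h": overlap_hours(bedtime, wake_up, win_start, win_end),
    }


def rmssd(values):
    """Root mean square of successive differences of a daily series."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    if not diffs:
        raise ValueError("need at least two daily values")
    return sqrt(sum(d * d for d in diffs) / len(diffs))


def screen_off_hours(events, win_start, win_end):
    """Hours with the screen off inside the window.

    `events` is a chronologically sorted list of (timestamp, state) pairs,
    where state is "on" or "off" (a hypothetical log format).
    """
    total = 0.0
    off_since = None
    for ts, state in events:
        if state == "off" and off_since is None:
            off_since = ts
        elif state == "on" and off_since is not None:
            total += overlap_hours(off_since, ts, win_start, win_end)
            off_since = None
    if off_since is not None:  # screen still off when the log ends
        total += overlap_hours(off_since, win_end, win_start, win_end)
    return total


if __name__ == "__main__":
    bed = datetime(2019, 1, 14, 23, 30)
    wake = datetime(2019, 1, 15, 5, 30)
    m = sleep_measures(bed, wake)
    print(m["duration_h"], m["night_h"])  # 6.0 5.5
    print(rmssd([7.0, 6.0, 8.0]))
    events = [
        (datetime(2019, 1, 14, 23, 40), "off"),
        (datetime(2019, 1, 15, 3, 0), "on"),
        (datetime(2019, 1, 15, 3, 10), "off"),
        (datetime(2019, 1, 15, 6, 30), "on"),
    ]
    print(screen_off_hours(events, *night_window(wake)))
```

In this example the self-reported night contributes 5.5 hours to the window, whereas the screen-off estimate is somewhat larger because the phone was idle before sleep onset and after 06:00 was clipped; this is the kind of overestimation that a screen-only proxy can produce.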

Figure 1

Sleep registration on smartphone-based application, Monsenso.

Statistical methods

All hypotheses and statistical analyses were planned a priori. Differences in demographic variables and clinical characteristics were analysed using mixed models for continuous variables and χ2 for categorical variables.

For each measure of interest, a two-level linear mixed-effect model, which accommodates both the variation of the variables of interest within patients (intraindividual variation) and between individuals (interindividual variation), was employed. Familial relationships and repeated measures for each participant were accounted for by adding family number and identification number as random factors. In analyses comparing mean differences between the three groups (patients with BD, UR and HC), groups were added as a fixed factor. For each measure of interest, an unadjusted model and a model adjusted for age and sex were conducted. The linear mixed-effect model also allows us to include all observations and not just complete data sets. Thus, all available daily smartphone-based assessments of sleep and automatically collected smartphone-based data were included in the statistical analyses.
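As a hedged illustration, the adjusted group-comparison model described above could be written as follows. The notation is introduced here and does not appear in the original article: $y_{fit}$ denotes the sleep measure for family $f$, participant $i$ and observation $t$, and HC is assumed to be the reference group.

```latex
y_{fit} = \beta_0
        + \beta_{\mathrm{BD}}\,\mathbb{1}[\mathrm{group}_{fi} = \mathrm{BD}]
        + \beta_{\mathrm{UR}}\,\mathbb{1}[\mathrm{group}_{fi} = \mathrm{UR}]
        + \beta_{\mathrm{age}}\,\mathrm{age}_{fi}
        + \beta_{\mathrm{sex}}\,\mathrm{sex}_{fi}
        + u_f + v_{fi} + \varepsilon_{fit},
\qquad
u_f \sim \mathcal{N}(0, \sigma_u^2),\quad
v_{fi} \sim \mathcal{N}(0, \sigma_v^2),\quad
\varepsilon_{fit} \sim \mathcal{N}(0, \sigma_\varepsilon^2)
```

Here $u_f$ is the random intercept for family number, $v_{fi}$ the random intercept for participant identification number, and $\varepsilon_{fit}$ the residual. The unadjusted model would omit the age and sex terms; normally distributed random effects are the usual assumption for such models rather than something stated in the text.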

For the objectives of the study, first we investigated whether the number of hours and daily variability in hours slept between 12:00 and 06:00, calculated based on smartphone-based self-reported and automatically generated sleep measurements, were associated with sleep quality measured with PSQI and clinically rated sleep items of the HAMD-17 and the YMRS. Summary measures of smartphone-based sleep indices were calculated for the same time period as addressed by the self-reported and clinical rating scales. Second, we investigated whether the estimation of sleep duration, calculated from the automatically collected smartphone data during the hours between 12:00 and 06:00, was associated with same-day self-reported sleep between 12:00 and 06:00. Third, we investigated whether clinician-rated and smartphone-based measurements of sleep differed between patients with BD, UR and HC.
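The matching of daily smartphone data to each rating scale's reference period can be sketched as below. This is only an illustration under stated assumptions: the helper names are hypothetical, the visit day itself is excluded from the window, and "the past month" covered by the PSQI is approximated as 30 days, while the HAMD-17 and YMRS sleep items cover the past 3 days.

```python
from datetime import date, timedelta


def window_values(daily, visit, days):
    """Daily values from the `days` days before `visit` (visit day excluded)."""
    start = visit - timedelta(days=days)
    return [value for day, value in sorted(daily.items()) if start <= day < visit]


def window_mean(daily, visit, days):
    """Mean of the available daily values in the window, or None if there are none."""
    values = window_values(daily, visit, days)
    return sum(values) / len(values) if values else None


if __name__ == "__main__":
    # Hypothetical hours slept between 12:00 midnight and 06:00, keyed by date.
    hours = {
        date(2019, 1, 10): 5.0,
        date(2019, 1, 12): 6.0,
        date(2019, 1, 13): 4.0,
        date(2019, 1, 14): 5.5,
    }
    visit = date(2019, 1, 15)
    print(window_mean(hours, visit, 3))   # summary for HAMD-17 / YMRS sleep items
    print(window_mean(hours, visit, 30))  # summary for the PSQI
```

Days without an entry are simply skipped, which mirrors the use of all available observations rather than complete data sets.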

Since the field has been investigated in only a few studies, and due to the explorative nature of the present study, adjustment for multiple testing was not conducted. Thus, p values below 0.05 were considered statistically significant. Model assumptions were checked graphically by using residuals and predicted values. All statistical analyses were conducted using the Statistical Package for the Social Sciences (SPSS) V.22.

Findings

Socio-demographic and clinical characteristics

A total of 240 patients with BD, 66 UR and 117 HC were included in the BIO study from September 2016 to February 2019. Of these, a total of 203 patients with BD, 54 UR and 109 HC individuals were included in the present study and provided self-reported smartphone-based data concerning sleep. The main reasons for not being willing to participate were that the study was too time-consuming, surveillance concerns, and not owning a smartphone and not wanting to borrow one. Automatically smartphone-generated data were only obtainable from Android smartphones, and measurements of the screen on/off were provided for 71 patients with BD, 15 UR and 32 HC. Socio-demographic and clinical characteristics for the participants are presented in table 1. In total, the participants from the three groups provided smartphone-based self-reported data for 49 153 days (patients with BD 31 687; UR 5873; HC 11 593) and automatically generated smartphone-based data were collected for a total of 29 001 days (patients with BD 17 822 days; UR 3426 days; HC 7753 days).

Table 1

Socio-demographic and clinical characteristics of patients with BD, UR and HC at baseline, n=366

Daily smartphone-based sleep measurements

As presented in table 2, we found that shorter smartphone-based self-reported sleep duration between 12:00 midnight and 06:00 was associated with a worsening of sleep quality measured using the total PSQI score (B=−0.07, 95% CI: −0.10 to −0.05, p<0.001). Also, there was a statistically significant negative association between self-reported smartphone-based hours slept between 12:00 midnight and 06:00 and sleep items on both the HAMD-17 (B=−0.10, 95% CI: −0.15 to −0.05, p<0.001) and the YMRS (B=−0.30, 95% CI: −0.43 to −0.17, p<0.001). Moreover, smartphone-based self-reported sleep duration was associated with sleep duration evaluated using the PSQI (adjusted for age and sex, B=0.73, 95% CI: 0.64 to 0.82, p<0.001).

Table 2

Associations between smartphone-based sleep indices* and a self-reported sleep questionnaire†, and clinically rated depressive‡ and manic§ symptoms, respectively

There were no associations between automatically generated smartphone-based estimates of hours slept between 12:00 and 06:00 and any self-rated or clinician-rated sleep measurements.

Daily automatically generated versus self-reported smartphone-based sleep measurements

As can be seen in table 3, the automatically generated estimate of sleep between 12:00 midnight and 06:00, based on screen on/off events, showed a statistically significant positive association with smartphone-based self-reported sleep time between 12:00 midnight and 06:00 (B=0.28, 95% CI: 0.26 to 0.30, p<0.001) and hence might represent a potential marker for sleep duration.

Table 3

Associations between self-reported smartphone-based sleep and automatically smartphone-generated sleep between 12:00 and 06:00

Differences in daily smartphone-based sleep measurements between patients with newly diagnosed BD, UR, and HC

As can be seen from the upper part of table 4, the day-to-day variability in total sleep duration was higher in patients with BD compared with UR and HC, respectively (model adjusted for age and sex: all p values <0.001). Furthermore, in patients with BD, the total sleep time between 12:00 midnight and 06:00 was shorter compared with UR and HC. Interestingly, the total sleep duration did not differ between the three groups. However, this may be explained by patients with BD having more variability in sleep duration (figure 2). The total PSQI scores and subitems addressing sleep on the HAMD-17 were statistically significantly higher in UR compared with HC individuals. Other measures did not differ statistically significantly between UR and HC individuals. In an exploratory subanalysis, including only visits when participants were in full or partial remission (HAMD-17 and YMRS <14), the differences between groups in the PSQI, the HAMD-17 and the YMRS scores were comparable with the results presented in table 4 (results not presented).

Figure 2

Clustered bar chart illustrating percentage of time with self-reported sleep duration by group.

Table 4

Estimated differences in sleep indices between patients with BD, UR and HC (n=366)

The lower part of table 4 shows that, between 12:00 midnight and 06:00, patients with BD had fewer hours with the screen turned off, greater day-to-day variability in hours with the screen turned off and a higher number of times the screen was turned on compared with HC. However, none of these differences reached statistical significance.

Discussion

This study showed three things. First, smartphone-based self-reported hours slept between 12:00 midnight and 06:00 reflected measures of sleep according to the PSQI, and sleep items on the HAMD and the YMRS, whereas automatically smartphone-generated sleep measures calculated based on the screen on/off time did not. Second, automatically smartphone-generated calculation of hours slept between 12:00 midnight and 06:00 was associated with daily smartphone-based self-reported hours slept between 12:00 midnight and 06:00. Third, according to smartphone-based self-reported and automatically generated measurements, patients with BD slept less between 12:00 midnight and 06:00 and with more interruption and variability in daily sleep patterns compared with HC, although differences in automatically smartphone-generated measures did not reach statistical significance. UR had numerically intermediary levels between patients and HC on most smartphone-based sleep measurements but did not differ statistically significantly from HC.

Estimation of sleep from smartphone-based data

Overall, this study indicates that smartphones could be a potential platform for long-term sleep monitoring. This finding is supported by other studies on the use of smartphones for sleep monitoring in patients with schizophrenia and healthy individuals with sleep problems.21 35 The lack of statistically significant differences in automatically estimated sleep between the three groups could be explained by the smaller amount of automatically smartphone-generated data, which reduces statistical power and increases the risk of type II error, and by the fact that we exclusively used the screen on/off time to estimate sleep. Participants are not necessarily asleep when the screen is off, which could result in a higher estimate of sleep time using smartphone-based sensor data compared with self-reports. In particular, bedtime might be estimated to be earlier than self-reported bedtime.21 Therefore, sleep between 12:00 midnight and 06:00 could be estimated using more advanced machine learning methods24 or possibly by combining screen features with other data sources suitable for sleep monitoring, such as ambient light, Global Positioning System locations or a Fitbit watch, to improve sleep characterisation.21 35–37 However, these methods can consume considerable battery power, which limits realistic use by patients in naturalistic settings.

Differences in sleep measurements between study groups

Interestingly, we found that patients with newly diagnosed BD have more irregular sleep patterns and reduced sleep quality compared with UR and HC, respectively. This could indicate that sleep disturbances are present at early illness stages. Specifically, patients with BD had a delayed sleep pattern (mean 26 min later bedtime and 41 min later wake-up time) and slept less between 12:00 midnight and 06:00 than HC. Also, patients with BD had more variability in daily sleep patterns. The latter is in accordance with findings in a meta-analysis.5 Interestingly, the mean PSQI total score for UR was 5.01, just above the cut-off for poor sleepers.32 The UR group’s scores on three of the PSQI subitems (sleep latency, sleep disturbance and sleep quality) were significantly different from those of the HC group. This is in accordance with several other studies that have reported sleep disturbances in individuals at risk of developing BD.10 11 38 Studies suggest that sleep onset latency, awakening during the night and overall sleep quality in particular could be trait and state markers for BD.3 11 39 40 The present study did not include smartphone-based self-assessment of sleep satisfaction, but used questionnaire-based measures of sleep quality. Combining smartphone-based subjective sleep reports and automatically generated sleep estimations could potentially provide a more sensitive and specific estimate of sleep variability, sleep latency and sleep disturbance, may be clinically useful in detecting emerging patterns of sleep disturbance and may represent a potential diagnostic, risk and treatment marker for BD.

Limitations

First, we investigated associations between smartphone-based sleep measures and the PSQI and sleep items on the HAMD-17 and the YMRS. Future studies comparing automatically generated smartphone-based sleep measurements against actigraphy-based sleep measurements or polysomnography would be highly relevant for validating objective sleep-monitoring methods. Second, participants were included regardless of mood state. Additionally, the patients with BD included in the present study presented rather low levels of depressive and manic symptoms during the study period. Thus, the associations and differences in sleep parameters when comparing the patients with BD, UR and HC will not reflect associations and differences during more severe affective states. However, the patients with BD still had sleep disturbances even though most were in remission. Third, linear mixed-effect models were used for the statistical analyses using all available data points. However, missing data were not accounted for in the statistical analyses, and thus were assumed to be missing at random. It should be considered that the patients had alterations in adherence to self-monitoring during depressive and manic periods, and thus the missing data may contain clinically useful information.41 Fourth, we investigated estimated sleep time between 12:00 midnight and 06:00. A larger time window would have been interesting to investigate, especially for those with a delayed sleep phase. However, we were interested in how much time the participants slept during nighttime. It is also more likely that participants are awake after 06:00 without using the phone, because they are at work, school, etc.; therefore, a single attribute, such as screen off time, may not be an adequate measure of sleep in the morning. Several sensors could be suitable for monitoring sleep, and integrating data from more than one of these sensors or applying more advanced machine learning algorithms may provide a more accurate sleep measurement that captures total sleep duration instead of only an estimate of sleep during nighttime. Also, phone usage may vary vastly between individuals. Therefore, it may be that the best estimate of sleep measures is a combination of self-reported and automatically generated sleep measurements. Fifth, we collected automatically generated smartphone data across multiple platforms, and there may be possible variations in automatically generated smartphone data between platforms, which we did not account for in the present study. Sixth, due to the sample size of the present study, we did not investigate potential differences between participants using Android smartphones and iPhone smartphones in smartphone-based self-assessed sleep measures. There may be potential demographic differences between these groups given the comparative costs of the devices. Seventh, although the three groups were matched on sex and age, work status differed, which probably influences sleep–wake patterns and may explain parts of the variability between patients with BD and UR/HC. Eighth, patients with BD were taking psychotropic medication, and this was not accounted for in the analyses. Finally, all patients were followed in the Copenhagen Affective Disorder Clinic, where stabilisation of the sleep–wake pattern is an essential part of the treatment. Therefore, sleep disturbances in this group might not be representative of all patients with BD.

Strengths

This is the first large observational study with 366 systematically recruited participants, including patients with newly diagnosed BD and UR. Diagnosis (and lack of diagnosis) was verified by using a SCAN interview. On a daily basis, participants reported sleep in real time, and because the smartphone system was used, recall bias was minimised. In addition, the participants were clinically evaluated and completed questionnaires during the study period. The clinical assessments used included the HAMD-17 and the YMRS, which, in contrast to the questionnaires, are clinical rating scales administered by an experienced clinical researcher.

Conclusions

Daily self-reported smartphone-based data may represent measurements of sleep patterns that differ between patients with BD and HC and potentially between UR and HC. Automatically smartphone-generated sleep measurements may hold promise but should be investigated in more detail, potentially including additional smartphone-based sensor data and more advanced feature extraction and algorithms to estimate circadian rhythm 24/7.

Acknowledgments

Special thanks to all the participants in the study for their valuable contribution and the clinicians at Copenhagen Affective Disorder Clinic, Copenhagen University Hospital, Denmark.

References

Footnotes

  • Contributors LVK and MF-J conceived the study. LVK obtained the required funding for the study and wrote the study protocol. LVK, MF-J, MV, JEB and MF were involved in optimising the study protocol. SS, SM and MV have been responsible for the recruitment of participants and have carried out the assessment and data collection. MF, JB and SS have been responsible for data processing. Data analysis was done by SS and MF-J and supervised by LVK. Interpretation of the data has been done by SS under the supervision of LVK and MF-J. All authors have read, contributed to and approved the final version of the manuscript. MF-J and LVK share last authorship.

  • Funding The study was funded by grants from the Mental Health Services, Capital Region of Denmark, The Danish Council for Independent Research, Medical Sciences (DFF-4183-00570), Weimans Fund, Markedmodningsfonden (the Market Development Fund) (2015-310), Gangstedfonden (A29594), Helsefonden (16-B-0063), Innovation Fund Denmark (5164-00001B), Copenhagen Center for Health Technology (CACHET), EU H2020 ITN (EU Project 722561), Augustinusfonden (16-0083), Lundbeck Foundation (R215-2015-4121).

  • Competing interests LVK and MV have been consultants for Lundbeck within the last 3 years. JEB and MF are co-founders and shareholders of Monsenso ApS.

  • Patient consent for publication Not required.

  • Ethics approval The Bipolar Illness Onset (BIO) study has been approved by the ethics committee in the Capital Region, Copenhagen, Denmark (Ref. No. H-7-2014-007) and the Danish Data Protection Agency, Capital Region of Copenhagen (Protocol No. RHP-2015-023). The study was conducted in accordance with the Declaration of Helsinki and all participants provided written informed consent.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data availability statement No data are available. The study is ongoing. Therefore, the research data are not shared.