Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Feb 20, 2018
Open Peer Review Period: Feb 20, 2018 - Aug 2, 2018
Date Accepted: Sep 24, 2018

The final, peer-reviewed published version of this preprint can be found here:

How Search Engine Data Enhance the Understanding of Determinants of Suicide in India and Inform Prevention: Observational Study

Adler N, Cattuto C, Kalimeri K, Paolotti D, Tizzoni M, Verhulst S, Yom-Tov E, Young A

How Search Engine Data Enhance the Understanding of Determinants of Suicide in India and Inform Prevention: Observational Study

J Med Internet Res 2019;21(1):e10179

DOI: 10.2196/10179

PMID: 30609976

PMCID: 6682304

How Search Engine Data Enhance the Understanding of Determinants of Suicide in India and Inform Prevention: A Population Study

  • Natalia Adler
  • Ciro Cattuto
  • Kyriaki Kalimeri
  • Daniela Paolotti
  • Michele Tizzoni
  • Stefaan Verhulst
  • Elad Yom-Tov
  • Andrew Young

ABSTRACT

Background:

India is home to 20% of the world’s suicide deaths. Although statistics regarding suicide in India are distressingly high, data and cultural issues likely contribute to a widespread underreporting of the problem. Social stigma and only recent decriminalization of suicide are among the factors hampering official agencies’ collection and reporting of suicide rates.

Objective:

As the product of a data collaborative, this paper uses private-sector search engine data to gain a fuller, more accurate picture of suicide among young people in India. By combining official statistics on suicide with data generated through search queries, this paper seeks to (1) add a layer of information that more accurately represents the magnitude of the problem, (2) determine whether search query data can serve as an effective proxy for factors contributing to suicide that are not represented in traditional datasets, and (3) consider how data collaboratives built on search query data could inform future suicide prevention efforts in India and beyond.

Methods:

We combined official demographic statistics with data generated through search queries from Bing to gain insight into suicide rates per state in India as reported by the National Crime Records Bureau (NCRB) of India. We extracted English-language queries on “suicide,” “depression,” “hanging,” “pesticide,” and “poison.” We also collected state-level demographic data for India, including urbanization, growth rate, sex ratio, internet penetration, and population. We modeled the suicide rate per state with the queries on each of the 5 topics as independent variables in a linear model. A second model was built by adding the demographic information as further linear independent variables.
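The two linear models described above can be illustrated with a minimal sketch. It assumes ordinary least squares with an intercept, which the abstract does not state explicitly, and it runs on synthetic numbers rather than the Bing or NCRB data. The helper `fit_linear`, the variable names, and the choice of 30 states are hypothetical and serve only as illustration.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns (coefficients, R^2)."""
    A = np.column_stack([np.ones(len(y)), X])      # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    ss_res = float(residuals @ residuals)          # unexplained variation
    ss_tot = float(((y - y.mean()) ** 2).sum())    # total variation
    return coef, 1.0 - ss_res / ss_tot

# Synthetic stand-ins (assumption): one row per state, one column per variable.
rng = np.random.default_rng(0)
n_states = 30
# Fraction of queries on: suicide, depression, hanging, pesticide, poison.
query_fractions = rng.random((n_states, 5))
# Urbanization, growth rate, sex ratio, internet penetration, population.
demographics = rng.random((n_states, 5))
suicide_rate = (
    10.0
    + query_fractions @ np.array([3.0, 1.0, 2.0, 4.0, 1.5])
    + rng.normal(0.0, 1.0, n_states)
)

# First model: query topics only.
_, r2_queries = fit_linear(query_fractions, suicide_rate)

# Second model: query topics plus demographic variables.
combined = np.column_stack([query_fractions, demographics])
_, r2_combined = fit_linear(combined, suicide_rate)

print(f"R^2, queries only:           {r2_queries:.3f}")
print(f"R^2, queries + demographics: {r2_combined:.3f}")
```

Because the second model contains every variable of the first, its in-sample R² can never be lower; a comparison on held-out states would be needed to judge whether the added variables carry real signal.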

Results:

The first model, which predicts suicide rates from the fraction of queries in each of the 5 topics and from the fraction of queries on all suicide methods, achieves a fit (R²) of about 0.5. The fit increases significantly when 3 outliers are removed and improves slightly further when 5 outliers are removed. For the second model, which uses both query and demographic data, demographic data model suicide rates better than query data in all categories when no outliers are removed. However, when 3 outliers are removed, query data about pesticides or poisons improve the model over using demographic data.
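The abstract does not say how outliers were identified. As one possible illustration, the sketch below assumes a simple residual-based rule: refit the linear model, drop the state with the largest absolute residual, and repeat k times. The functions `r_squared` and `drop_largest_residuals`, the synthetic data, and the perturbed rows are all hypothetical and are not taken from the paper.

```python
import numpy as np

def r_squared(X, y):
    """Fit OLS with an intercept; return (R^2, residuals)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - float(resid @ resid) / ss_tot, resid

def drop_largest_residuals(X, y, k):
    """Iteratively remove the k rows with the largest absolute residual.

    Returns the indices (into the original arrays) of the rows that remain.
    """
    keep = np.arange(len(y))
    for _ in range(k):
        _, resid = r_squared(X[keep], y[keep])
        worst = int(np.argmax(np.abs(resid)))   # position within `keep`
        keep = np.delete(keep, worst)
    return keep

# Synthetic stand-in data (assumption): 30 states, 5 query-topic fractions.
rng = np.random.default_rng(1)
X = rng.random((30, 5))
y = X @ np.array([3.0, 1.0, 2.0, 4.0, 1.5]) + rng.normal(0.0, 0.3, 30)
y[[2, 11, 20]] += 8.0   # artificially perturb three rows to act as outliers

keep = drop_largest_residuals(X, y, k=3)
removed = sorted(set(range(len(y))) - set(keep.tolist()))

r2_all, _ = r_squared(X, y)
r2_trimmed, _ = r_squared(X[keep], y[keep])
print("removed rows:", removed)
print(f"R^2 with all rows: {r2_all:.3f}")
print(f"R^2 after removal: {r2_trimmed:.3f}")
```

Under a rule like this, the removed rows are the states whose suicide rates the model explains worst, which is why they may merit separate examination.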

Conclusions:

In this work, we used search data and demographics to model suicide rates. In this way, search data serve as a proxy for unmeasured (hidden) factors corresponding to suicide rates. Moreover, our procedure for outlier rejection serves to single out states where the suicide rates have substantially different correlations with demographic factors and query rates.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.