Lexicons are built from medical knowledge sources (SNOMED-CT, ICD-10, UMLS, and Clinical Trials). The dataset comprises 500 anonymized Redditors, their posts, and labels annotated by domain experts.
The research paper describing this dataset and the lexicons can be found here. If you find this dataset and the lexicons useful in your research, please cite:
@inproceedings{gaur2019knowledge,
title={Knowledge-aware assessment of severity of suicide risk for early intervention},
author={Gaur, Manas and Alambo, Amanuel and Sain, Joy Prakash and Kursuncu, Ugur and Thirunarayan, Krishnaprasad and Kavuluru, Ramakanth and Sheth, Amit and Welton, Randy and Pathak, Jyotishman},
booktitle={The World Wide Web Conference},
pages={514--525},
year={2019},
organization={ACM}
}