The researchers found that indicators of future mental health can in fact be picked up from people’s natural language through computational methods. They analyzed speech samples from 40 patients in the North American Prodrome Longitudinal Study (NAPLS) at Emory University, who were followed for two years. Thirty participants from the second phase of the study were used to train the model. The researchers also used more than 30,000 Reddit posts to establish a speech baseline for the algorithm.

Those in the prodromal phase of psychosis were more likely to use speech with low semantic density and to talk about voices and sounds. During the follow-up period, seven individuals in the training group converted to psychosis, while 23 did not. Another five converters and five non-converters from the third phase of NAPLS were used to validate the model.

The model analyzed speech in several processing steps: it tagged words by part of speech and inflection, and it represented the meaning of sentences as word embeddings. The automated process could predict whether an at-risk person would later develop psychosis with 93% accuracy.
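
For readers unfamiliar with word embeddings, the idea of representing a sentence's meaning as numbers can be illustrated with a small Python sketch. It is not the researchers' method: the three-dimensional vectors below are invented for illustration, and averaging word vectors is a common simplification rather than the technique used in the study. The names `TOY_EMBEDDINGS`, `sentence_vector` and `cosine_similarity` are hypothetical.

```python
import math

# Hypothetical 3-dimensional word vectors, invented for illustration only.
# Real embeddings are learned from large text collections and have
# hundreds of dimensions.
TOY_EMBEDDINGS = {
    "i": [0.1, 0.0, 0.2],
    "hear": [0.0, 0.9, 0.1],
    "voices": [0.1, 0.8, 0.3],
    "sounds": [0.0, 0.7, 0.4],
    "like": [0.3, 0.1, 0.1],
    "coffee": [0.9, 0.0, 0.1],
}


def sentence_vector(sentence):
    """Average the vectors of the known words in a sentence."""
    words = sentence.lower().split()
    vectors = [TOY_EMBEDDINGS[w] for w in words if w in TOY_EMBEDDINGS]
    if not vectors:
        raise ValueError("no known words in sentence")
    dims = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


voices = sentence_vector("I hear voices")
sounds = sentence_vector("I hear sounds")
coffee = sentence_vector("I like coffee")

# Sentences about similar topics end up closer together in vector space.
print(cosine_similarity(voices, sounds) > cosine_similarity(voices, coffee))
```

In this toy setup, the two sentences about hearing things sit closer together than either does to the sentence about coffee, which is the kind of numerical signal a model can then learn from.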

The results even went beyond just detecting when psychosis was likely to occur.

“It was previously known that subtle features of future psychosis are present in people’s language, but we’ve used machine learning to actually uncover hidden details about those features,” Phillip Wolff, senior author and a professor of psychology at Emory, said in a statement.

The study is promising for the future of machine learning and natural language processing in the field of mental health, researchers said.

“This research is interesting not just for its potential to reveal more about mental illness, but for understanding how the mind works—how it puts ideas together,” Wolff said. “Machine learning technology is advancing so rapidly that it’s giving us tools to data mine the human mind.”