This article from The New York Times illustrates the potential of mining large amounts of unstructured data (Big Data) to identify medication safety problems, not just in the clinical setting but also in a public health context. Enjoy 🙂
Using data drawn from queries entered into Google, Microsoft and Yahoo search engines, scientists at Microsoft, Stanford and Columbia University have for the first time been able to detect evidence of unreported prescription drug side effects before they were found by the Food and Drug Administration’s warning system.
Using automated software tools to examine queries by six million Internet users taken from Web search logs in 2010, the researchers looked for searches relating to an antidepressant, paroxetine, and a cholesterol lowering drug, pravastatin. They were able to find evidence that the combination of the two drugs caused high blood sugar.
The study, which was reported in the Journal of the American Medical Informatics Association on Wednesday, is based on data-mining techniques similar to those employed by services like Google Flu Trends, which has been used to give the public early warning of the prevalence of flu.
The F.D.A. asks physicians to report side effects through a system known as the Adverse Event Reporting System. But its scope is limited by the fact that data is generated only when a physician notices something and reports it.
The new approach is a refinement of work done by the laboratory of Russ B. Altman, the chairman of the Stanford bioengineering department. The group had explored whether it was possible to automate the process of discovering “drug-drug” interactions by using software to hunt through the data found in F.D.A. reports.
The group reported in May 2011 that it was able to detect the interaction between paroxetine and pravastatin in this way. Its research determined that patients taking both drugs had a higher risk of developing hyperglycemia than patients taking either drug individually.
The new study was undertaken after Dr. Altman wondered whether there was a more immediate and more accurate way to gain access to data similar to what the F.D.A. had access to.
He turned to computer scientists at Microsoft, who created software for scanning anonymized data collected from a software toolbar installed in Web browsers by users who permitted their search histories to be collected. The scientists were able to explore 82 million individual searches for drug, symptom and condition information.
The researchers first identified individual searches for the terms paroxetine and pravastatin, as well as searches for both terms, in 2010. They then computed the likelihood that users in each group would also search for hyperglycemia as well as roughly 80 of its symptoms — words or phrases like “high blood sugar” or “blurry vision.”
They determined that people who searched for both drugs during the 12-month period were significantly more likely to search for terms related to hyperglycemia than were those who searched for just one of the drugs. (About 10 percent, compared with 5 percent and 4 percent for just one drug.)
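The comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the researchers' software: the log format, the function name `group_rates`, the shortened term list and the toy data are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical, abbreviated term list; the study used roughly 80 symptom terms.
DRUG_A = "paroxetine"
DRUG_B = "pravastatin"
SYMPTOM_TERMS = {"hyperglycemia", "high blood sugar", "blurry vision"}


def group_rates(log):
    """Return, per user group, the share of users with a symptom-related query.

    `log` is an iterable of (user_id, query) pairs (an assumed format).
    """
    drugs_seen = defaultdict(set)  # user_id -> drugs the user searched for
    symptom_users = set()          # users with at least one symptom query

    for user_id, query in log:
        text = query.lower()
        for drug in (DRUG_A, DRUG_B):
            if drug in text:
                drugs_seen[user_id].add(drug)
        if any(term in text for term in SYMPTOM_TERMS):
            symptom_users.add(user_id)

    # group -> [users with a symptom query, all users in the group]
    counts = defaultdict(lambda: [0, 0])
    for user_id, drugs in drugs_seen.items():
        group = "both" if len(drugs) == 2 else next(iter(drugs)) + " only"
        counts[group][1] += 1
        if user_id in symptom_users:
            counts[group][0] += 1

    return {group: hits / total for group, (hits, total) in counts.items()}


# Made-up toy log, not data from the study.
toy_log = [
    ("u1", "paroxetine side effects"),
    ("u1", "pravastatin dosage"),
    ("u1", "high blood sugar causes"),
    ("u2", "paroxetine withdrawal"),
    ("u3", "pravastatin and grapefruit"),
    ("u3", "blurry vision"),
    ("u4", "pravastatin price"),
    ("u5", "paroxetine pravastatin together"),
]
rates = group_rates(toy_log)
print(rates)
```

On real logs, the interesting question is whether the "both" group's share is clearly higher than either single-drug share, as the roughly 10 percent versus 5 and 4 percent figures suggest. A real analysis would also need a significance test, which this sketch leaves out.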
They also found that people who did the searches for symptoms relating to both drugs were likely to do the searches in a short time period: 30 percent did the search on the same day, 40 percent during the same week and 50 percent during the same month.
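The timing figures can be pictured with a similar toy calculation. The sketch below assumes that each user is reduced to one pair of dates (a drug search and a symptom search) and that "same week" and "same month" mean within 7 and 30 days. Both are simplifying assumptions, not definitions taken from the paper, and `window_shares` is a hypothetical helper name.

```python
from datetime import date


def window_shares(pairs, windows=(0, 7, 30)):
    """Share of users whose two searches fall within each window (in days).

    `pairs` holds one (drug_search_date, symptom_search_date) tuple per user.
    """
    gaps = [abs((symptom - drug).days) for drug, symptom in pairs]
    return {w: sum(gap <= w for gap in gaps) / len(gaps) for w in windows}


# Made-up dates, not data from the study.
toy_pairs = [
    (date(2010, 3, 1), date(2010, 3, 1)),   # same day
    (date(2010, 3, 1), date(2010, 3, 5)),   # 4 days apart
    (date(2010, 3, 1), date(2010, 3, 25)),  # 24 days apart
    (date(2010, 3, 1), date(2010, 6, 1)),   # 92 days apart
]
shares = window_shares(toy_pairs)
print(shares)
```

The shares are cumulative, so each wider window includes the users counted in the narrower ones, which matches the rising 30, 40 and 50 percent pattern reported in the article.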
“You can imagine how this kind of combination would be very, very hard to study given all the different drug pairs or combinations that are out there,” said Eric Horvitz, a managing co-director of Microsoft Research’s laboratory in Redmond, Wash.
The researchers said they were surprised by the strength of the “signal” that they detected in the searches and argued that it would be a valuable tool for the F.D.A. to add to its current system for tracking adverse effects. “There is a potential public health benefit in listening to such signals,” they wrote in the paper, “and integrating them with other sources of information.”
The researchers said that they were now thinking about how to add new sources of information, like behavioral data and information from social media sources. The challenge, they noted, was to integrate new sources of data while protecting individual privacy.
Currently the F.D.A. has financed the Sentinel Initiative, an effort begun in 2008 to assess the risks of drugs already on the market. Eventually, that project plans to monitor drug use by as many as 100 million people in the United States. The system will be based on information collected by health care providers on a massive scale.
“I think there are tons of drug-drug interactions; that’s the bad news,” Dr. Altman said. “The good news is we also have ways to evaluate the public health impact.”