The high-level, long-term goal is to research how to use the Internet of Things to collect data on human behavior in a manner that preserves privacy but provides sufficient information to allow interventions which modify that behavior.
The initial research plan is built around three interrelated levels of analysis: individual, group, and society. At each level, we are investigating the interplay between static and dynamic properties, and paying special attention to the ethical and economic issues that arise when confronting major scientific challenges like this one.
Recent technological advances have enabled collection of diverse health data at an unprecedented scale. Omics data from genomes, transcriptomes, proteomes, metabolomes, DNA methylomes, and microbiomes, as well as electronic medical records and data from sensors and wearable devices, provide detailed...
DeepDive is a system to extract value from dark data. Like dark matter, dark data is the great mass of data buried in text, tables, figures, and images, which lacks structure and so is essentially unprocessable by existing software. DeepDive helps bring dark data to light by creating structured data (SQL tables) from unstructured information...
In the late 2000s, the prices of many staple crops sold on markets in low- and middle-income African countries tripled. Higher prices may compromise households’ ability to purchase enough food, or alternatively increase incomes for food-producing households.
Use of Electronic Phenotyping and Machine Learning Algorithms to Identify Familial Hypercholesterolemia Patients in Electronic Health Records
FIND FH (Flag, Identify, Network and Deliver for Familial Hypercholesterolemia) aims to pioneer new techniques for the identification of individuals with Familial Hypercholesterolemia (FH) within electronic health records (EHRs).
Electronic interfaces to the brain are increasingly being used to treat incurable disease, and eventually may be used to augment human function. An important requirement to improve the performance of such devices is that they be able to recognize and effectively interact with the neural circuitry to which they are connected.
The aim of this proposal is to develop and apply advanced data science techniques to address fundamental challenges of physics event reconstruction and classification at the Large Hadron Collider (LHC). The LHC is exploring physics at the energy frontier, probing some of the most fundamental questions about the nature of our universe.
Mapping the Universe is an activity of fundamental interest, linking as it does some of the biggest questions in modern astrophysics and cosmology: What is the Universe made of, and why is it accelerating? How do the initial seeds of structure form and grow to produce our own Galaxy? Wide field astronomical surveys, such as that planned with...
Mendelian diseases are caused by single gene mutations. In aggregate, they affect 3% (~250M) of the world’s population. The diagnosis of thousands of Mendelian disorders has been radically transformed by genome sequencing.
This project aims to improve in-season predictions of yields for major crops in the United States, as well as a related goal of mapping soil properties across major agricultural states. The project uses a combination of graphical models, approximate Bayesian computation, and crop simulation models to make predictions based on weather and satellite data.
Modern data science is highly exploratory in nature. A typical data analyst does not sit before a computer with a fixed set of hypotheses, but rather arrives at the most interesting questions and patterns after getting his/her hands dirty exploring the data. This exploration process creates complex selection biases in the reported findings and violates the standard independence assumption of statistics and machine learning.
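The selection effect described above can be illustrated with a small simulation. This is a minimal sketch, not the project's method: the function name `simulate`, the number of features, the sample size, and the trial count are all assumptions chosen for illustration. Every feature is pure noise, yet the feature an analyst would pick after exploring looks much stronger than one fixed in advance.

```python
import random
import statistics


def simulate(num_features=50, n=20, trials=200, seed=0):
    """Compare a pre-specified null feature with the best-looking one.

    All features are standard normal noise, so every true mean is zero.
    Returns the average absolute sample mean for (a) a feature chosen
    before looking at the data and (b) the feature with the largest
    absolute sample mean, chosen after looking.
    """
    rng = random.Random(seed)
    fixed, selected = [], []
    for _ in range(trials):
        means = [
            statistics.fmean([rng.gauss(0.0, 1.0) for _ in range(n)])
            for _ in range(num_features)
        ]
        fixed.append(abs(means[0]))                     # hypothesis fixed up front
        selected.append(max(abs(m) for m in means))     # hypothesis found by exploring
    return statistics.fmean(fixed), statistics.fmean(selected)


fixed_effect, selected_effect = simulate()
print(f"pre-specified feature: {fixed_effect:.3f}")
print(f"best-looking feature:  {selected_effect:.3f}")
```

Reporting only the second number, without accounting for the search that produced it, is the kind of bias the project is concerned with.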
The MyHeart Counts study – launched in the spring of 2015 on Apple’s ResearchKit platform – seeks to mine the treasure trove of heart health and activity data that can be gathered in a population through mobile phone apps. Because the average adult in the U.S. checks his/her phone dozens of times each day, phone apps that target cardiovascular health are a promising tool to quickly gather large amounts of data about a population's health and fitness, and ultimately to influence people to make healthier choices.
Boneh’s lab is working on an efficient mechanism for confidential transactions on the blockchain (joint work with Benedikt Buenz). A confidential transaction (CT) lets two parties transact on the blockchain without revealing the amount of money that one party pays the other.
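The description above does not say which construction the lab uses. As a hedged illustration of how an amount can stay hidden while a transaction is still checkable, the sketch below uses an additively homomorphic commitment in the style of Pedersen commitments, a common building block in confidential-transaction designs. The group parameters `P`, `Q`, `G`, `H` and the function `commit` are toy assumptions for illustration and are far too small to be secure.

```python
# Toy parameters, assumed for illustration only. Real systems use large
# elliptic-curve groups, and H must have an unknown discrete log relative
# to G; in a group this small that log can simply be brute-forced.
P = 2039   # safe prime, P = 2*Q + 1
Q = 1019   # prime order of the subgroup of squares modulo P
G = 4      # generator of that subgroup
H = 9      # second generator of the same subgroup


def commit(amount, blinding):
    """Commit to `amount`, hidden by a random `blinding` factor."""
    return (pow(G, amount % Q, P) * pow(H, blinding % Q, P)) % P


# The sender spends a 70-unit input, paying 50 and keeping 20 as change.
r_pay, r_change = 123, 456
c_in = commit(70, r_pay + r_change)
c_pay = commit(50, r_pay)
c_change = commit(20, r_change)

# Anyone can check that inputs equal outputs using only the commitments,
# because multiplying commitments adds the hidden amounts.
balanced = c_in == (c_pay * c_change) % P
print(balanced)
```

A real scheme also needs a proof that each hidden amount lies in a valid range, so that a negative or wrapped-around value cannot be used to create money. That part is omitted here.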
A system developed at Stanford lets distant hospitals and clinics use confidential healthcare data to build decision support applications without sharing any patient data among those institutions, thus facilitating multi-institution research studies on massive datasets. This collaboration between Microsoft and Stanford will develop an MS Azure application based on this system, providing a solution that is robust, usable, and widely deployable at many healthcare institutions.
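The description above does not specify how the system learns from data that never leaves each institution. One simple way to illustrate the general idea, assumed here rather than taken from the project, is for each site to send only aggregate sums to a coordinator, which combines them into a pooled model. The names `LocalStats`, `summarize`, and `fit_line`, and the sample records, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LocalStats:
    """Aggregate sums a site shares in place of patient-level records."""
    n: int
    sx: float
    sy: float
    sxx: float
    sxy: float


def summarize(records):
    """Run at each site: reduce (x, y) records to aggregate sums."""
    return LocalStats(
        n=len(records),
        sx=sum(x for x, _ in records),
        sy=sum(y for _, y in records),
        sxx=sum(x * x for x, _ in records),
        sxy=sum(x * y for x, y in records),
    )


def fit_line(stats):
    """Run at the coordinator: least-squares line from the combined sums."""
    n = sum(s.n for s in stats)
    sx = sum(s.sx for s in stats)
    sy = sum(s.sy for s in stats)
    sxx = sum(s.sxx for s in stats)
    sxy = sum(s.sxy for s in stats)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


# Made-up records held at two separate sites.
site_a = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]
site_b = [(4.0, 8.1), (5.0, 9.8)]

slope, intercept = fit_line([summarize(site_a), summarize(site_b)])
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

The fitted line matches what a single fit on the pooled records would give, although no record crosses a site boundary. Sharing aggregates does not by itself guarantee privacy, since summaries of very small groups can still reveal individual values.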