The Stanford Data Science Initiative (SDSI), founded in 2014, has worked to provide researchers with new technologies and resources. Now in its fourth year, SDSI is proud to be the foundation of Stanford University's major, expanded efforts to develop a campus-wide data science ecosystem.
The Stanford Data Science Initiative aims to make Stanford a data-enabled university. The Initiative advances data science methods and tools and weaves them into the fabric of the university to respond effectively to our most pressing societal and scientific challenges.
The world is being transformed by data, and data-driven analysis is rapidly becoming an integral part of science and society. The Stanford Data Science Initiative is a collaborative effort across many departments in all seven schools. We strive to unite existing data science research initiatives and create interdisciplinary collaborations, connecting data science and related methodologists with disciplines that are being transformed by data science and computation.
The Initiative supports research in a variety of fields where major advances are being made through meaningful collaborations between domain researchers, who have deep expertise in societal and fundamental research challenges, and methods researchers, who are developing next-generation computational tools and techniques, including:
Education is at the core of the University’s mission. The rise of new data science tools and techniques provides both challenges and opportunities. Education researchers have ever-larger datasets on critical K-12 teaching, but analyzing these datasets requires more and more sophisticated techniques. Similar data is becoming available at the undergraduate level, for example from the use of teaching platforms here at the University. Data science can help us deal with these challenges, bringing new methods and analysis techniques that can help devise new teaching and intervention strategies that lead to more effective learning.
Human health data is collected at all levels of resolution, from single-cell genetic information all the way to the individual and population levels. Integrating and analyzing these massive traces of data gives us an unparalleled opportunity to realize new types of scientific approaches that provide novel insights about our lives, health, and happiness. The next frontier in health research will come from integrating such data so that effective treatments can be personalized to each individual. However, gaining valuable insights from these data requires new data science techniques and approaches that turn noisy observational data into strong scientific results and further our knowledge.
Data science can enhance our understanding of humanity, from studying individual human behavior to creating better models of human communities, organizations, cultures, and societies. Data science can provide advanced tools to interpret the data (massive in both modern and historical terms) that document the behavior of people, groups, organizations, and society more generally, and help us reason about causal relationships and the complex motivations of societal actors. This is critical for advances across the social sciences, humanities, business, education, and law.
Data has the potential to change how we think about and treat our environment. Our understanding of our physical surroundings and processes is quickly advancing with new data available from sensors aboard satellites and on the ground. Analyses of these data sources now require highly sophisticated scientific computation to address our planet’s greatest natural challenges and to understand our place in the universe.
Data Science for Public Policy
New sources of data and digitized historical information enable new questions to be asked about public policy. At the same time, there is an increasing need to better understand the ethical and legal implications of the rapid uptake of digital technology, particularly for the future of work and political discourse.
As data science tools and techniques now play a pivotal role in decision making across all fields, it is important to acknowledge and understand the full implications of our work. It is no longer enough for ethical considerations to be made late in a project, or after it has been largely completed. Nor is it enough for ethical considerations in data science to be considered outside the technical work itself. Instead, the future of a data-enabled University will require a deeply collaborative and hands-on relationship between scientists and ethicists, both working at the forefront of their fields, to fully consider outcomes from the start.
Future data science must provide both an enhanced capability to reason algorithmically from data and, even more crucially, confidence in the validity of the results, particularly at the societal level. Challenges include algorithmic bias and fairness; the transparency, interpretability, and accountability of decisions; and the security and privacy of data. Further advances are needed in both predictive and causal modeling. The immense future challenges facing the world can only be met with support from data science that is continually enhanced by bold, wide-ranging, and socially responsible research.