Pioneering Data Science for a Data-Driven Future

UC San Diego's Halıcıoğlu Data Science Institute Director Rajesh Gupta says the emerging field of data science enables us to “see” things that have never before been visible and make connections never before possible.

Rajesh Gupta, distinguished professor of computer science and engineering, during a June ceremony celebrating the opening of a new building for UC San Diego's Halıcıoğlu Data Science Institute, part of an ongoing five-year anniversary celebration of the institute.

If you ask computer scientist Rajesh Gupta, modern technology has led humanity to develop a sixth sense – and it's not that hard-to-define intuition or gut instinct that so many of us rely on.

It’s something much more reliable: data.

Data gives people the ability to understand and interact with their surroundings from a much more informed place than ever before, says Gupta, founding director of UC San Diego’s Halıcıoğlu Data Science Institute (HDSI).

“The world of data enables new perceptions that help us navigate the world and provides new tools to do so,” he explains. “Data science allows us to look at the world in much greater depth and breadth through computational eyes, make sense of what we see, and put this knowledge to action.”

At its simplest, data is just information. But the large amounts of information created through modern-day interactions, connections and conveniences, when looked at through the lens of data science, can improve everything from healthcare to government to education to natural disaster preparedness. It’s this vast trove of networked data that powers the engines of Artificial Intelligence, and that may well help reshape the world as we know it.

Launched in 2018, HDSI, under Gupta’s leadership, has become an innovative force that pushes the limits of a rapidly growing field by bringing together an interdisciplinary team of researchers from areas ranging from computer science to communications, medicine to philosophy. Working together, these researchers explore new computational methods and mathematical models, and guide us through the societal and ethical impacts of data science.

A fully independent academic unit, HDSI offers a well-grounded education to prepare undergraduate and graduate students for tomorrow’s workforce. The institute’s growth has been rapid: HDSI enrolls 4,800 undergraduate, master’s and doctoral students each year, boasts nearly 50 faculty and 15 postdocs, and has more than 700 alumni working in roles that range from machine learning engineers to data analysts. HDSI also recently celebrated its five-year anniversary by opening the doors to its new academic home, a 45,000-square-foot, state-of-the-art building that will house its faculty, staff, students and four separate degree programs.

In a world that increasingly relies on big data and AI, Gupta takes the responsibility of educating the next generation of data scientists seriously and has ensured it is central to the mission of HDSI.

In a recent interview, UC San Diego Today asked Gupta, a distinguished professor of computer science and engineering, about the role of data science in modern society and how UC San Diego is positioned as a leader in the field.

First, let's start very basic: What is data science and how is UC San Diego shaping the field?

Data Science is an emerging field at the intersection of computing, mathematics, statistics and human cognition that enables us to make sense of data and put it to effective use by humans and machines. 

The field of data science is rapidly expanding in its depth of theory and methods that enable things like large language models, as well as in its breadth for the direct impact it has on many fields and industry sectors. With the full complement of resources available at a large public university, including its highly ranked medical school, climate and marine sciences, you could assert that UC San Diego is the only university with what it takes to shape this emerging field. To do so, however, we also realized that we needed to teach new things and explore new ways of organizing academic activities before these benefits can be fully realized.

Curriculum and research at UC San Diego focus on both the complex mathematical models that underpin data science and the ethics and practical use of computational models. What does it mean to be skilled in data science, but also aware of the societal implications of its use?

To teach data science, we realized we had to change our approach to curriculum and degree programs in fundamental ways. Traditionally, we look at our courses through the lens of what knowledge and which skills we are imparting. Data science adds a whole new dimension to this: “awareness.” Awareness has many components: reproducibility, responsibility and generalizability of results. We must educate our students not only to be skilled in, say, optimization methods, but also to ask whether such an optimization should even be an objective, and under what guardrails.

We have created programs and a curriculum that ensure that data scientists educated at UC San Diego are uniquely equipped to reshape how society works, looks and feels, and how we interact with our environment. We’ve also taken it one step further, ensuring that they understand the imperative to do so responsibly, seeing both the forests and the trees in a world awash with data.

We’ve seen a rise in the use of chatbots and AI this past year. How are computational scientists integral in pioneering new, better and safer AI technologies?

This is still very much a work in progress. We are only beginning to understand the dimensions of trustworthiness and robustness of AI as it is practiced today. Yet, we are far from having working solutions or best practices. Our faculty and researchers are pushing the boundaries of theoretical understanding and practical tools to create a world of “safe AI” technologies, whether as an entirely machine-generated activity or as an amplification of human activities.

Even at this early stage, we have made advances in understanding how modern AI techniques work, and how they can be manipulated into producing incorrect or malicious outcomes. We are putting these to use in creating the knowledge for better policy and better technology.

What are some of the most exciting developments on the horizon in the field of data science, and how are UC San Diego researchers contributing to this work?

A realization is slowly dawning on the broader community that the lens of data science and computing enables us to “see” things that we never could before. A great example of this is right here at UC San Diego, where scientists are using AI to find patterns that are not visible to practicing physicians in our medical centers.

For instance, at UC San Diego Health, a machine-learning algorithm can detect air trapping in the lungs, and several researchers on campus are advancing the use of AI to develop better models of disease. According to Eric Topol, our collaborator at Scripps Research, AI can detect kidney disease, diabetes, blood pressure and even liver and gallbladder diseases using retinal images, or predict hyperthyroidism, valve and kidney disease using electrocardiogram results.
