Spring Data Science Day: May 10, 2019

Monday, May 6, 2019
James McGirk

A remarkable roster is in store for Spring 2019 Data Science Day. UC Santa Cruz faculty from an array of disciplines including computer science and engineering, literature, economics, statistics, sociology, and computational media will join policy directors and Stanford’s Mehran Sahami for a day of short talks, discussions, and poster presentations all focused on what is fast becoming one of the most important academic areas of the 21st century.

UC Santa Cruz Spring Data Science Day will take place on Friday, May 10, from 9am to 4pm in Engineering 2, room 180, and the Baskin Engineering Courtyard.

This year’s focus is on responsible data science and data science for social good. Responsible data science requires attention to fairness, interpretability, bias, explainability, privacy, accountability and more. The day will feature short talks by local talent (including many of our new faculty), a poster session, and a keynote by Mehran Sahami of Stanford University.

The day will begin at 9:05 (come earlier for coffee and a continental breakfast) with a session on Responsible Data Science moderated by Abel Rodríguez, associate dean for graduate affairs, and Angus Forbes, computational media professor.

9:05-10:30: Responsible Data Science Session
The Professor and the Dashboard: A Cautious Approach to Classroom Data Analytics

The first talk will be from Jody Greene, professor of literature, associate vice provost for teaching and learning, and director of the Center for Innovations in Teaching and Learning, who will discuss how new data analytics tools created at the University of California are being used to track and support student success at the classroom level.

Situated, Collaborative Modeling: Critical Participatory Data Science in Rural Zimbabwe with the Muonde Trust

Melissa Viola Eitzel Solera of the UCSC Science and Justice Research Center will argue that modeling needs to be practiced more critically and less automatically, and will suggest that modeling practices be grounded in History of Consciousness Professor Emerita Donna Haraway’s concept of situated knowledge. She will then demonstrate how she applies these principles in her own collaborative modeling efforts with the Muonde Trust in Mazvihwa Communal Area, Zimbabwe.

Fair Algorithms

Baskin School of Engineering’s own Yang Liu, professor of computer science and engineering, will discuss algorithms and machine learning (ML). Algorithms may appear to be fair, but without human intervention they can be riddled with potential bias or discrimination. Liu will also discuss the following questions: How does one guarantee the quality of data collected from potentially careless or even malicious human agents? How does one build ML methods that are robust despite noise in the data? And how does one guarantee fair and transparent treatment of people when ML is deployed?

Abhradeep Guha Thakurta of computer science and engineering at UC Santa Cruz will also present.

10:30-10:45: Break
10:45-11:45: Keynote
Ethical Considerations in Data Science (Mehran Sahami, Stanford University)

Mehran Sahami will deliver the keynote for the event. Sahami is a professor and associate chair for education in the computer science department at Stanford University. His talk will examine some of the promise and perils that arise from work in data science. He will consider specific examples to take deep dives into ethical issues to understand both the technical and competing value trade-offs at stake.


11:30-12:00: Poster Highlights
12:15-1:15: Lunch
1:15-2:30: Data Science for Social Good Session

Jim Whitehead, professor of computational media and chair of the computational media department, will moderate a session titled “Data Science for Social Good.”

Data Dividends?: Rethinking Work and the Commons in the Era of Big Data

In this talk, Chris Benner, director of the Everett Program for Technology and Social Change and the Institute for Social Transformation, will discuss California Governor Gavin Newsom’s proposal to implement a data dividend and his current work on a universal technology dividend. He will explore questions related to the common-property characteristics of technology and innovation, the monopolistic characteristics of information markets, and the need to rethink how we define work in contemporary labor markets.

Causal Inference Without an Experiment

Carlos Dobkin of the economics department at UC Santa Cruz and the National Bureau of Economic Research will discuss the shortcomings of randomized controlled trials (RCTs). RCTs are the gold-standard approach to generating causal estimates and are critical for producing the unbiased estimates of treatment effects needed to form effective government policy. He will discuss how regression discontinuity design can be used in some situations where ethical or practical constraints make it impossible to implement an RCT.

Modeling for Seasonal Marked Point Processes: An Analysis of Evolving Hurricane Occurrences

Athanasios Kottas, graduate director of the statistics department at the Baskin School of Engineering at UC Santa Cruz, will present a Bayesian nonparametric modeling approach to study the dynamic evolution of a seasonal marked point process intensity. The research is based on his analysis of hurricane landfalls along the Gulf and Atlantic coasts from 1900 to 2010.

Using Big Data to Improve Students' Educational Outcomes in the Silicon Valley

Rebecca London, professor of sociology, will give a talk exploring the Silicon Valley Regional Data Trust, a data-sharing collaborative among education, juvenile probation, child welfare and behavioral health agencies. Her talk will highlight ways that big data can be harnessed for social good and the challenges to overcome in creating systematic and ethical changes to everyday data collection practices in youth-serving organizations.

Real-World Benefits of Machine Learning in Healthcare

Narges Norouzi, computer science and engineering teaching professor at the Baskin School of Engineering, will give a brief survey of the efforts in the Applied Machine Learning Lab in the computer science and engineering department. She will discuss state-of-the-art machine learning healthcare applications and the lab’s efforts in those areas, the data sources being used and the lab’s data-driven investigations, the ethics of using algorithms in healthcare, and future applications of the technology.

The events will conclude with an hour and twenty minutes to peruse posters in the Baskin Engineering Courtyard.
We hope that you’ll join us.

For more information, please visit: https://data-science-day.soe.ucsc.edu