Fangzhu Shen
Hello! My name is Fangzhu Shen (沈芳竹). I am a third-year PhD student in the Computer Science Department at Duke University, advised by Prof. Sudeepa Roy.
I am broadly interested in database management and data analysis, with a focus on the end-to-end data lifecycle. I have worked on problems in data cleaning, influence attribution in social networks, and selectivity estimation in an online learning setting. Currently, I am exploring how large language models are transforming data management and the new challenges they introduce, as well as query verification with formal methods.
I welcome discussions about new research ideas! Whether you're interested in potential collaborations or want to chat about my research, I'd love to connect.
I am actively looking for internship opportunities for Summer 2026. Please reach out if you have any openings!
|
Email / CV / Google Scholar / DBLP / Github / LinkedIn
|
Publications
(* = equal contributions)
The Cost of Representation by Subset Repairs
Yuxi Liu*, Fangzhu Shen*, Kushagra Ghosh, Amir Gilad, Benny Kimelfeld, Sudeepa Roy
Proceedings of the VLDB Endowment (PVLDB), 2025.
[pdf] [code]
A novel framework for estimating the cost of representation for subset repairs, i.e., how many extra tuples must be deleted to satisfy both the integrity constraints (e.g., functional dependencies) and the representation constraints for given sub-populations.
|
Causal What-If and How-To Analysis Using HypeR
Fangzhu Shen*, Kayvon Heravi*, Oscar Gomez, Sainyam Galhotra, Amir Gilad, Sudeepa Roy, Babak Salimi
International Conference on Data Engineering (ICDE), Demonstration Track, 2023.
[pdf]
We demonstrate HypeR, which allows users to formulate complex hypothetical queries using a SQL-like syntax and presents the output as interactive visualizations.
|
Duke University, Durham, NC, USA
Ph.D. in Computer Science, January 2023 - Present
|
Duke University, Durham, NC, USA
M.S. in Economics and Computation, August 2021 - December 2022
|
Central University of Finance and Economics, Beijing, China
B.A. in Finance, September 2016 - July 2020
|
PhD Software Engineering Intern, Uber
Coordinated Structural Pricing Team, Summer 2025
Project: Dynamic Structural Estimation for Pricing Model: An End-to-End Learning Framework
To address the challenge of forecasting hidden marketplace parameters, I developed a dynamic learning framework that integrates non-parametric ML models with an economics-driven structural pricing model. It improved the accuracy of marketplace estimation and, in turn, the accuracy of pricing decisions.
|
Teaching Assistant, Introduction to Database Systems, Duke University, Fall 2025
Teaching Assistant, Causal Inference, Fairness, and Explanations in Data Analysis, Spring 2025
Teaching Assistant, Introduction to Databases, Fall 2024
Teaching Assistant, Introduction to Database Systems, Fall 2023
|
Shadow PC member: International Conference on Very Large Data Bases (VLDB) 2026
|
VLDB 2025 Travel Award
ICDE 2023 Travel Award
Duke Scholar Award, 2021-2022.
Academic Scholarships of Central University of Finance and Economics, 2017-2019.
|