I'm Yan Chen, a 4th-year PhD student in Decision Sciences at the Fuqua School of Business, Duke University, advised by Professor Alexandre Belloni. I obtained my Master's in Statistics from Stanford University in 2018 and my Bachelor's in Statistics from the Department of Mathematics at Nanjing University in 2016. I worked as a data scientist in industry from 2018 to 2022.
My research lies broadly at the intersection of operations research, statistics, econometrics and machine learning. I am interested in developing theoretically grounded methods for learning and inference that improve decision-making under uncertainty. Methodologically, my work draws on statistical learning theory, high-dimensional statistics, econometrics, causal inference, online learning and optimization. I apply these tools to problems motivated by artificial intelligence and operations management.
Research
Selected Preprints / Working Papers
Compound Estimation for Binomials
Yan Chen, Lihua Lei
27th ACM Conference on Economics and Computation (EC 2026)
Presented at CODE@MIT 2025 (Poster) · Causal Data Science Meeting 2025 (Spotlight) · 2025 Stanford Causal Science Center Conference (Poster)
arxiv
Many applications require estimating the means of multiple binomial outcomes: assessing intergenerational mobility of census tracts, estimating prevalence of infectious diseases across countries, and measuring click-through rates for different demographic groups. The standard approach is to report the plain average of each outcome. Despite its simplicity, this approach yields noisy estimates when the sample sizes or mean parameters are small. In contrast, Empirical Bayes (EB) methods can boost the average accuracy by borrowing information across tasks. Nevertheless, EB methods require a Bayesian model in which the parameters are sampled from a prior distribution that, unlike in the commonly studied Gaussian case, is unidentified due to the discreteness of binomial measurements. Even if the prior distribution is known, computation is difficult when the sample sizes are heterogeneous, as there is no simple joint conjugate prior for the sample size and mean parameter. In this paper, we consider the compound decision framework, which treats the sample sizes and mean parameters as fixed quantities. We develop an approximate Stein’s Unbiased Risk Estimator (SURE) for the average mean squared error given any class of estimators. For a class of machine learning-assisted linear shrinkage estimators, we establish asymptotic optimality, regret bounds, and valid inference. Unlike existing work, we work with the binomials directly without resorting to Gaussian approximations. This allows us to handle small sample sizes and/or mean parameters in both one-sample and two-sample settings. We demonstrate our approach using three datasets on firm discrimination, education outcomes, and innovation rates.
Adversarial Estimation of Assortment Probabilities under Independence Structure
Alexandre Belloni, Yan Chen, Matthew Harding
27th ACM Conference on Economics and Computation (EC 2026)
Presented at Joint Statistical Meetings 2025 · American Causal Inference Conference 2025 · California Econometrics Conference 2024 · Midwest Econometrics Group Conference 2024
arxiv
code
We consider the problem of estimating assortment probabilities, which is common in operations management applications such as product bundling and advertising. Existing approaches typically model each assortment as a category and apply multinomial models to estimate the choice probabilities; while computationally convenient, these methods do not exploit independence structures in the joint distribution and may therefore be statistically inefficient when the total number of items is large. Using the representation from Bahadur (1959), we relate the sparsity of the generalized correlation coefficients to the independence structure of the binary components. We formulate the problem as estimating a high-dimensional vector of generalized correlation coefficients, together with low- or moderate-dimensional nuisance parameters corresponding to the marginal probabilities. We develop a regularized adversarial estimator that attains the optimal rate under standard regularity conditions while remaining computationally feasible. The framework naturally extends to settings with covariates. We apply the proposed estimators to causal inference with multiple binary treatments and show substantial finite-sample improvements over non-adaptive methods. Numerical studies corroborate the theoretical results.
Testing Fairness with Utility Tradeoffs: A Wasserstein Projection Approach
Yan Chen, Zheng Tan, Jose Blanchet, Hanzhang Qin
arxiv
Market Design for Platform-Mediated Influencer Advertising
Yan Chen, Saša Pekeč
Presented at 2025 EC Frontiers of Online Advertising workshop (Spotlight) · 2025 MSOM main conference
Online Bin Packing with Load Balancing
Zheng Tan, Hanzhang Qin, Yan Chen, Lindong Liu, Yugang Yu
ssrn
Double Distributionally Robust Bid Shading for First Price Auctions
Yanlin Qu, Ravi Kant, Yan Chen, Brendan Kitts, San Gultekin, Aaron Flores, Jose Blanchet
arxiv
Publications
Compound Estimation for Binomials (extended abstract)
Yan Chen, Lihua Lei
27th ACM Conference on Economics and Computation (EC 2026)
Adaptive Estimation of Multivariate Binary Distributions under Sparse Generalized Correlation Structures (extended abstract)
Alexandre Belloni, Yan Chen, Matthew Harding
27th ACM Conference on Economics and Computation (EC 2026)
Optimal Downsampling for Imbalanced Classification with Generalized Linear Models
Yan Chen, Jose Blanchet, Krzysztof Dembczynski, Laura Fee Nern, Aaron Flores
AISTATS 2025 | International Conference on Artificial Intelligence and Statistics
paper
Downsampling, or under-sampling, is a technique used with large and highly imbalanced classification models. We study optimal downsampling for imbalanced classification using generalized linear models (GLMs). We propose a pseudo maximum likelihood estimator and study its asymptotic normality in the context of increasingly imbalanced populations relative to an increasingly large sample size. We provide theoretical guarantees for the introduced estimator. Additionally, we compute the optimal downsampling rate using a criterion that balances statistical accuracy and computational efficiency. Our numerical experiments, conducted on both synthetic and empirical data, further validate our theoretical results and demonstrate that the introduced estimator outperforms commonly available alternatives.
Concurrent Reinforcement Learning with Aggregated States via Randomized Least Squares Value Iteration
Yan Chen, Qinxun Bai, Yiteng Zhang, Maria Dimakopoulou, Shi Dong, Qi Sun, Zhengyuan Zhou
ICML 2025 | International Conference on Machine Learning
paper
Designing learning agents that explore efficiently in a complex environment has been widely recognized as a fundamental challenge in reinforcement learning. While a number of works have demonstrated the effectiveness of techniques based on randomized value functions on a single agent, it remains unclear, from a theoretical point of view, whether injecting randomization can help a society of agents concurrently explore an environment. The theoretical results established in this work tender an affirmative answer to this question. We adapt the concurrent learning framework to randomized least-squares value iteration (RLSVI) with aggregated state representation. We demonstrate polynomial worst-case regret bounds in both finite- and infinite-horizon environments. In both setups the per-agent regret decreases at an optimal rate of Θ(1/√N), highlighting the advantage of concurrent learning. Our algorithm exhibits significantly lower space complexity compared to (Russo, 2019) and (Agrawal et al., 2021). We reduce the space complexity by a factor of K while incurring only a √K increase in the worst-case regret bound, compared to (Agrawal et al., 2021; Russo, 2019). Interestingly, our algorithm improves the worst-case regret bound of (Russo, 2019) by a factor of H^{1/2}, matching the improvement in (Agrawal et al., 2021). However, this result is achieved through a fundamentally different algorithmic enhancement and proof technique. Additionally, we conduct numerical experiments to demonstrate our theoretical findings.
Society of Agents: Regret Bounds of Concurrent Thompson Sampling
Yan Chen, Perry Dong, Qinxun Bai, Maria Dimakopoulou, Wei Xu, Zhengyuan Zhou
NeurIPS 2022 | Advances in Neural Information Processing Systems
paper
We consider the concurrent reinforcement learning problem where n agents simultaneously learn to make decisions in the same environment by sharing experience with each other. Existing works in this emerging area have empirically demonstrated that Thompson sampling (TS) based algorithms provide a particularly attractive alternative for inducing cooperation, because each agent can independently sample a belief environment (and compute a corresponding optimal policy) from the joint posterior computed by aggregating all agents' data, which induces diversity in exploration among agents while benefiting from the experience shared by all agents. However, theoretical guarantees in this area remain under-explored; in particular, no regret bound is known for TS-based concurrent RL algorithms. In this paper, we fill this gap by considering two settings. In the first, we study the simple finite-horizon episodic RL setting, where TS is naturally adapted into the concurrent setup by having each agent sample from the current joint posterior at the beginning of each episode. We establish a Õ(HS√AT/n) per-agent regret bound, where H is the horizon of the episode, S is the number of states, A is the number of actions, T is the number of episodes and n is the number of agents. In the second setting, we consider the infinite-horizon RL problem, where a policy is measured by its long-run average reward. Here, despite not having natural episodic breakpoints, we show that by a doubling-horizon schedule, we can adapt TS to the infinite-horizon concurrent learning setting to achieve a regret bound of Õ(DS√ATn), where D is the standard notion of diameter of the underlying MDP and T is the number of timesteps. Note that in both settings, the per-agent regret decreases at an optimal rate of Θ(1/√n), which manifests the power of cooperation in concurrent RL.
Talks
August 2026 · Boston, MA
July 2026 · University of Michigan
May 2026 · Online
video · January 2026 · Online
November 2025 · Stanford, CA
November 2025 · Cambridge, MA
November 2025 · Online
October 2025 · Atlanta, GA
August 2025 · Nashville, TN
July 2025 · Columbia University, NYC
May 2025 · Detroit, MI
November 2024 · Lexington, KY
October 2024 · Seattle, WA
September 2024 · UC Davis, CA
October 2023 · Phoenix, AZ
August 2023 · Online
November 2022 · New Orleans, LA
October 2022 · Indianapolis, IN
Teaching
Master of Quantitative Management, Fuqua School of Business
Teaching evaluation: 5.0 / 5.0
Duke Undergraduate
Master of Quantitative Management, Fuqua School of Business
Master of Quantitative Management, Fuqua School of Business
Activities
Fall 2022 – Spring 2023
Appointed by Associate Dean for Graduate Programs · Fall 2023 – Spring 2024