
The 2nd Joint Conference on Statistics

and Data Science in China

Abstract

Haigeng Convention Center

Kunming Yunnan

July 12-14, 2024


Contents

July 12 09:00 - 10:40

Plenary Talk 1: Neural Causal AI: Adversarial Invariance Learning from Heterogeneous Environments

Jianqing Fan, Room: Lecture Hall (1st Floor) …………………………………… ##

Plenary Talk 2: Genomic Testing in the Presence of Unmeasured Confounding and Missing Data

Kathryn Roeder, Room: Lecture Hall (1st Floor) …………………………… ##

July 12 11:00 - 11:50 ………………………………………………………………………… ##

Plenary Talk 3: Multi-Scale Spatial-Temporal Data in Brain Science: Data, Model and Theory

Jianfeng Feng, Room: Lecture Hall (1st Floor) ………………………………… ##

July 12 14:00 - 15:40 ………………………………………………………………………… ##

Panel Discussion on Developing Statistics in China: History of Statistics in China

Wei Yuan, Room: Yulan Hall (1st Floor) ……………………………………… ##

IS016: Exploring the Frontiers of Machine Learning: Algorithms, Theoretical Insights and Applications

Organizer: Weijie Su, Room: A201-202 (2nd Floor) …………………………………… ##

IS095: Modern Statistical Learning for High-Dimensional Data

Organizer: Jingyuan Liu, Room: A216-217 (2nd Floor) ………………………………… ##

IS005: AI and Machine Learning in Complex Biomedical Data

Organizer: Hongzhe Li, Room: A301-302 (3rd Floor) ………………………………… ##

IS026: Innovative Statistical Methods for Data with Complex Structures

Organizer: Lan Wang, Room: A320-321 (3rd Floor) …………………………………… ##

IS012: Data Science Methods for Complex Data with Endogeneity and Heterogeneity

Organizer: Linda Zhao, Room: A203 (2nd Floor) …………………………………… ##

IS004: Advancing Statistical Frontiers in Data Privacy Protection

Organizer: Weijie Su, Room: A218 (2nd Floor) ……………………………………… ##

IS101: Statistical Inference and Computation for Complex Data

Organizer: Baoxue Zhang, Room: A303 (3rd Floor) ………………………………… ##

IS031: Machine Learning Methods: Theory and Applications

Organizer: Xinyuan Song, Room: A305 (3rd Floor) ………………………………… ##

IS097: Statistical Learning for Threshold Models and Applications

Organizer: Wei Zhong, Room: A306 (3rd Floor) …………………………………… ##

IS075: Statistical Learning for Large Foundation Models

Organizer: Shurong Zheng, Room: A307 (3rd Floor) ………………………………… ##

IS080: Theoretical Foundations for Machine Learning

Organizer: Tracy Ke, Room: A308 (3rd Floor) ……………………………………… ##

IS015: Experimental Design and Big Data Subsampling

Organizer: Niansheng Tang, Room: A315 (3rd Floor) ………………………………… ##

SS2: Bernoulli Session on Stochastic Methods for Data Science

Organizer: Ajay Jasra, Room: A316 (3rd Floor) …………………………………… ##

IS090: Intersection Research of Statistics and Computer Science (统计学与计算机科学的交叉研究)

Organizer: Xingdong Feng, Room: A317 (3rd Floor) ………………………………… ##

IS028: Interface Between Statistics and Neuro and Cognitive Science

Organizer: Song Xi Chen, Room: A318 (3rd Floor) ………………………………… ##


CS003: Recent Advances in Mixture Model

Room: A322 (3rd Floor) ………………………………………………………………… ##

CS004: Statistical Hypothesis Testing in Complex Data

Room: B601 (3rd Floor) ………………………………………………………………… ##

CS001: Recent Advances in Reinforcement Learning

Room: B603 (3rd Floor) ………………………………………………………………… ##

CS006: Interdisciplinary and Applied Research: Statistical Analysis on Medical Data and Models

Room: B606 (3rd Floor) ………………………………………………………………… ##

July 12 16:00 - 18:05

IS072: Statistical Interdisciplinary Studies I

Organizer: Song Xi Chen, Room: Yulan Hall (1st Floor) …………………………………… ##

IS050: Recent Advances in Functional and Complex Data

Organizer: Fang Yao, Room: A201-202 (2nd Floor) …………………………………… ##

IS071: Statistical Inference for High-Dimensional Data

Organizer: Tracy Ke, Room: A216-217 (2nd Floor) ………………………………… ##

IS006: AI and Machine Learning in Single Cell Genomics

Organizer: Hongzhe Li, Room: A301-302 (3rd Floor) ………………………………… ##

IS054: Recent Advances in Statistical Machine Learning

Organizer: Zhenhua Lin, Room: A320-321 (3rd Floor) …………………………………… ##

IS060: Recent Developments in Complex Time Series Analysis

Organizer: Linda Zhao, Room: A203 (2nd Floor) …………………………………… ##

IS045: Recent Advancements in Large Network and Tensor Data Analysis

Organizer: Tracy Ke, Room: A218 (2nd Floor) ……………………………………… ##

IS083: Limit Theory of Large Dimensional Random Matrices (大维随机矩阵极限理论)

Organizer: Shurong Zheng, Room: A303 (3rd Floor) ………………………………… ##

IS099: Economic Statistics and Research on High-Quality Development (经济统计与高质量发展研究)

Organizer: Hu Zhang, Room: A305 (3rd Floor) …………………………………… ##

IS091: Model Averaging and Related Topics

Organizer: Xinyu Zhang, Room: A306 (3rd Floor) …………………………………… ##

IS079: The Interplay Between Statistical Inference and Data-Driven Decision Making

Organizer: Zhimei Ren, Room: A307 (3rd Floor) ………………………………… ##

IS011: Data Science and Engineering

Organizer: Jian Shi, Room: A308 (3rd Floor) ……………………………………… ##

IS051: Recent Advances in High-Dimensional and Heterogeneous Data Analysis

Organizer: Xinyuan Song, Room: A315 (3rd Floor) ………………………………… ##

IS062: Recent Developments in the Analysis of High-Dimensional and Complex Data

Organizer: Jinyuan Chang, Room: A316 (3rd Floor) …………………………………… ##

CS007: Precision Medicine and Survival Data

Room: A317 (3rd Floor) ………………………………………………………………… ##

CS008: Factor Models and Bayesian Analysis

Room: A318 (3rd Floor) ………………………………………………………………… ##

CS009: Statistical Machine Learning: Methodology and Applications

Room: A322 (3rd Floor) ………………………………………………………………… ##

CS010: Recent Advances in Statistical Learning Methods


Room: B601 (3rd Floor) ………………………………………………………………… ##

CS011: Causal Inference and Applications

Room: B603 (3rd Floor) ………………………………………………………………… ##

CS012: Recent Advances in Statistical Inference

Room: B606 (3rd Floor) ………………………………………………………………… ##

July 13 08:30 - 10:10

Plenary Talk 4: Build an End-to-End Scalable and Interpretable Data Science Ecosystem by Integrating Statistics, ML, and Domain Sciences

Xihong Lin, Room: Lecture Hall (1st Floor) …………………………………… ##

Plenary Talk 5: Statistics and its Applications in Forensic Science and the Criminal Justice System

Alicia Carriquiry, Room: Lecture Hall (1st Floor) ………………………… ##

July 13 10:30 - 11:20

Plenary Talk 6: Generative Adversarial Learning with Optimal Input Dimension and Its Adaptive Generator Architecture

Huazhen Lin, Room: Lecture Hall (1st Floor)……………… ##

July 13 14:00 - 15:40

IS053: Recent Advances in Statistical Learning

Organizer: Tony Cai, Room: Yulan Hall (1st Floor) ……………… ##

IS022: Frontiers of Statistical Machine Learning

Organizer: Annie Qu, Room: A201-202 (2nd Floor) ……………… ##

IS036: New Advances in Complex Data Analyses

Organizer: Peter Song, Room: A216-217 (2nd Floor) ……………… ##

IS037: New Statistical Methods for Causal Inference and Hidden Factor Learning

Organizer: Fang Yao, Room: A301-302 (3rd Floor) ……………… ##

IS063: Recent Topics in Machine Learning

Organizer: Zijian Guo, Room: A320-A321 (3rd Floor) ……………… ##

IS025: Independence Test and Association Analysis

Organizer: Liping Zhu, Room: A203 (2nd Floor) ……………… ##

IS066: Semiparametric Modeling for Complex Survival Data

Organizer: Xinyuan Song, Room: A218 (2nd Floor) ……………… ##

IS070: Statistical Inference for Biological and Medical Data

Organizer: Qizhai Li, Room: A303 (3rd Floor) ……………… ##

IS078: Statistics in Earth Science Applications

Organizer: Song Xi Chen, Room: A305 (3rd Floor) ……………… ##

IS065: Robust Inference in High-Dimensional Complex Data

Organizer: Zhanrui Cai, Room: A306 (3rd Floor) ……………… ##

IS052: Recent Advances in Sequencing and Imaging Data Analysis

Organizer: Anru Zhang, Room: A307 (3rd Floor) ……………… ##

IS085: Industrial Big Data and Intelligent Statistical Analysis (工业大数据和智能化统计分析)

Organizer: Jianping Zhu, Room: A308 (3rd Floor) ……………… ##

IS076: Statistical Learning on Multi-Source and Complicated Data

Organizer: Niansheng Tang, Room: A315 (3rd Floor) ……………… ##

SS1: Bernoulli Session on Statistical Methodology & Theory

Organizer: Jeff Yao, Room: A316 (3rd Floor) ……………… ##


IS056: Recent Advances in Statistical Network Analysis - Methodology and Applications

Organizer: Ji Zhu, Room: A317 (3rd Floor) ……………… ##

IS102: Statistical Measurement, Evaluation and Decision (统计测度、评价与决策分会场)

Organizer: Weihua Su, Room: A318 (3rd Floor) ……………… ##

CS019: Statistical Applications in Economics and Medicine

Room: A322 (3rd Floor) ……………… ##

CS016: Statistical Inference in Complex Data Analysis

Room: B601 (3rd Floor) ……………… ##

CS017: Statistical Modeling for Complex Networks

Room: B603 (3rd Floor) ……………… ##

CS018: Complex Data Analysis

Room: B606 (3rd Floor) ……………… ##

July 13 16:00 - 18:05

IS048: Recent Advances in Deep Learning Theory (深度学习理论最新进展)

Organizer: Huazhen Lin, Room: Yulan Hall (1st Floor) ……………… ##

IS082: Trustworthy AI

Organizer: Annie Qu, Room: A201-202 (2nd Floor) ……………… ##

IS021: Foundation Models in Large-Scale Biomedical Studies

Organizer: Ting Li, Room: A216-217 (2nd Floor) ……………… ##

IS077: Statistical Theory and Learning

Organizer: Chinese Society for Probability and Statistics, Room: A301-302 (3rd Floor) … ##

IS027: Innovative Statistical Methods for Heterogeneous Data

Organizer: Lan Wang, Room: A320-A321 (3rd Floor) ……………… ##

IS040: Novel Applications in Biostatistics

Organizer: Annie Qu, Room: A203 (2nd Floor) ……………… ##

IS073: Statistical Interdisciplinary Studies II

Organizer: Song Xi Chen, Room: A218 (2nd Floor) ……………… ##

IS087: Data Science and Business Intelligence Statistical Analysis (数据科学与商业智能统计分析)

Organizer: Jianping Zhu, Room: A303 (3rd Floor) ……………… ##

IS013: Design and Modeling for Computer Experiments (计算机试验的设计与建模)

Organizer: Niansheng Tang, Room: A305 (3rd Floor) ……………… ##

IS001: Advanced Estimation Methods and Machine Learning

Organizer: Xingqiu Zhao, Room: A306 (3rd Floor) ……………… ##

IS003: Advancements in Statistical Inference of Point Processes and Their Applications

Organizer: Jiancang Zhuang, Room: A307 (3rd Floor) ……………… ##

IS007: Asymptotic Theory and High-Dimensional Statistics

Organizer: Takeru Matsuda, Room: A308 (3rd Floor) ……………… ##

IS092: Statistical Network Analysis and Its Application

Organizer: Jialiang Li, Room: A315 (3rd Floor) ……………… ##

IS009: Complex Data, Geometry and Related Fields

Organizer: Zhigang Yao, Room: A316 (3rd Floor) ……………… ##

IS067: Statistical Analysis with Complex Data

Organizer: Qihua Wang, Room: A317 (3rd Floor) ……………… ##

IS102: Statistical Measurement, Evaluation and Decision (统计测度、评价与决策分会场)


Organizer: Weihua Su, Room: A318 (3rd Floor) ……………… ##

CS021: Matrix Theory and Sufficient Dimension Reduction

Room: A322 (3rd Floor) ……………… ##

CS022: Asymptotic Theory in Probability and Statistics

Room: B601 (3rd Floor) ……………… ##

CS023: Recent Advances in Graphical Models and Image

Room: B603 (3rd Floor) ……………… ##

CS024: Statistical Modeling for Complex Data

Room: B606 (3rd Floor) ……………… ##

July 14 08:30 - 10:10

IS023: Deep Generative Models

Organizer: Jian Huang, Room: Yulan Hall (1st Floor) ……………… ##

IS096: New Statistical and Machine Learning Methods for Complex Data

Organizer: Depeng Jiang, Room: A201-202 (2nd Floor) ……………… ##

IS024: High-Dimensional Statistical Learning

Organizer: Chinese Society for Probability and Statistics, Room: A216-217 (2nd Floor) …… ##

IS019: Financial Machine Learning

Organizer: Xinghua Zheng, Room: A301-302 (3rd Floor) ……………… ##

IS008: Causal Inference in Observational Studies

Organizer: Chinese Society for Probability and Statistics, Room: A320-A321 (3rd Floor) … ##

IS002: Advancements in Integrative Statistical Inference

Organizer: Wenguang Sun, Room: A203 (2nd Floor) ……………… ##

IS047: Recent Advances in Data Integration in Survey Sampling

Organizer: Jae-Kwang Kim, Room: A218 (2nd Floor) ……………… ##

IS049: Recent Advances in Efficient and Fair Machine Learning

Organizer: Anru Zhang, Room: A303 (3rd Floor) ……………… ##

IS057: Recent Advances in Statistical Network Analysis - Theory and Methodology

Organizer: Ji Zhu, Room: A305 (3rd Floor) ……………… ##

IS017: Financial and Macroeconometrics

Organizer: Zhijie Xiao, Room: A306 (3rd Floor) ……………… ##

IS020: Foundation Models in Modern Industries

Organizer: Jinhan Xie, Room: A307 (3rd Floor) ……………… ##

IS029: Interface of Functional Data Analysis and Dynamic Models

Organizer: Jiguo Cao, Room: A308 (3rd Floor) ……………… ##

IS033: Modeling and Statistical Inference of Medical Big Data

Organizer: Chinese Society for Probability and Statistics, Room: A315 (3rd Floor) …… ##

CS002: Recent Advances in Deep Learning

Room: A316 (3rd Floor)……………… ##

CS025: Complex Data Modeling

Room: A317 (3rd Floor) ……………… ##

CS026: Change-Point Detection

Room: A318 (3rd Floor) ……………… ##

CS027: Nonparametric Statistical Inference

Room: A322 (3rd Floor) ……………… ##


CS028: Recent Advances in Large-Scale Data

Room: B601 (3rd Floor) ……………… ##

CS029: Interdisciplinary and Applied Research: Statistical Analysis on Medical and Economic Data

Room: B603 (3rd Floor) ……………… ##

CS030: Statistical Modeling and Its Applications

Room: B606 (3rd Floor) ……………… ##

July 14 10:30 - 12:00

IS038: New Statistical Methods for Complex Imaging and Genetics Data

Organizer: Fang Yao, Room: Yulan Hall (1st Floor) ……………… ##

IS035: Network Analysis and Cluster Analysis

Organizer: Anderson Zhang, Room: A201-202 (2nd Floor) ……………… ##

IS014: Dynamic and Reinforcement Learning

Organizer: Jialiang Li, Room: A216-217 (2nd Floor) ……………… ##

IS043: Panel Data and Microeconometrics

Organizer: Zhijie Xiao, Room: A301-302 (3rd Floor) ……………… ##

IS034: Modern Statistical Methods for Causal Inference

Organizer: Jae-Kwang Kim, Room: A320-A321 (3rd Floor) ……………… ##

IS018: Financial Big Data

Organizer: Xinghua Zheng, Room: A203 (2nd Floor) ……………… ##

IS069: Statistical Inference Beyond Euclidean Spaces

Organizer: Xueqin Wang, Room: A218 (2nd Floor) ……………… ##

IS041: Omics and Big Data in Medical Research (组学与医学大数据)

Organizer: Feng Chen, Room: A303 (3rd Floor) ……………… ##

IS042: Optimality Consideration in Modern Statistical Inference

Organizer: Arlene Kim, Room: A305 (3rd Floor) ……………… ##

IS061: Recent Developments in Conformal Inference and Causal Inference

Organizer: Wenguang Sun, Room: A306 (3rd Floor) ……………… ##

IS039: Nonlinear Probability and Statistics for Machine Learning

Organizer: Zengjing Chen, Room: A307 (3rd Floor) ……………… ##

IS010: Data Privacy and Statistical Modeling

Organizer: Zhigang Li, Room: A308 (3rd Floor) ……………… ##

IS058: Recent Development in Complex Data Analysis

Organizer: Xingdong Feng, Room: A315 (3rd Floor) ……………… ##

CS015: High Dimensional Statistical Inference

Room: A316 (3rd Floor) ……………… ##

CS031: Advances in Statistical Methods for Large and Complex Data

Room: A317 (3rd Floor) ……………… ##

CS032: Complex Data Analysis

Room: A318 (3rd Floor) ……………… ##

CS033: Statistical Modeling and Application of Complex Data

Room: A322 (3rd Floor) ……………… ##

CS034: Statistical Applications in Interdisciplinary Research

Room: B601 (3rd Floor) ……………… ##

CS035: Model Averaging/Cross-Disciplinary Research in Statistics


Room: B603 (3rd Floor) ……………… ##

CS036: Feature Screening and High Dimensional Data

Room: B606 (3rd Floor) ……………… ##

July 14 14:00 - 15:40

IS032: Model-Agnostic Statistical Inference

Organizer: Changliang Zou, Room: Yulan Hall (1st Floor) ……………… ##

IS089: Statistical Research on the Digital Economy and Its Impact (数字经济及其影响的统计研究)

Organizer: Xiuying Ma, Room: A201-202 (2nd Floor) ……………… ##

IS046: Recent Advances in Causal Inference

Organizer: Zijian Guo, Room: A216-217 (2nd Floor) ……………… ##

IS093: Mathematical Foundations in AI

Organizer: Qian Lin, Room: A301-302 (3rd Floor) ……………… ##

IS059: Recent Developments in Causal Learning (因果学习最新进展)

Organizer: Huazhen Lin, Room: A320-A321 (3rd Floor) ……………… ##

IS030: Large-Scale Inference and Private Statistical Analysis

Organizer: Qihua Wang, Room: A203 (2nd Floor) ……………… ##

IS068: Statistical Applications in Behavioral Decision and Behavioral Experiments (统计学在行为决策及行为实验中的应用)

Organizer: Lei Shi, Room: A218 (2nd Floor) ……………… ##

IS074: Statistical Learning for Complex and Challenging Data

Organizer: Jianxin Pan, Room: A303 (3rd Floor) ……………… ##

IS081: Theory, Method and Application for Major Problems in Statistical Modernization of China

Organizer: Yanyun Zhao, Room: A305 (3rd Floor) ……………… ##

IS084: Statistical Modeling and Inference of High-Dimensional Complex Data (高维复杂数据的统计建模与推断)

Organizer: Xingdong Feng, Room: A306 (3rd Floor) ……………… ##

CS005: Recent Advances in Bayesian Analysis

Room: A307 (3rd Floor) ……………… ##

CS037: Recent Advances in Quantile Regression

Room: A308 (3rd Floor) ……………… ##

CS038: Advances in Statistical Methods for Complex Data

Room: A315 (3rd Floor) ……………… ##

CS039: Advances in Missing Data and Treatment Effects

Room: A316 (3rd Floor) ……………… ##

CS040: Bayesian and Machine Learning

Room: A317 (3rd Floor) ……………… ##

CS014: Ultrahigh Dimensional Statistical Inference

Room: A318 (3rd Floor) ……………… ##

CS042: Bayesian and Nonparametric Statistical Inferences

Room: A322 (3rd Floor) ……………… ##

CS043: Recent Advances in Differentially Private and Complex Data Model

Room: B603 (3rd Floor) ……………… ##

CS044: Complex Data Model

Room: B606 (3rd Floor) ……………… ##


July 14 16:00 - 17:40

IS064: Research on the Statistics and Development of Networked Economic and Social Systems in the Context of Digital Intelligence Technology

Organizer: Yanyun Zhao, Room: Yulan Hall (1st Floor) ……………… ##

IS055: Recent Advances in Statistical Machine Learning: Theory and Applications

Organizer: Rong Ma, Room: A201-202 (2nd Floor) ……………… ##

IS044: Progress in Best Subset Selection

Organizer: Xueqin Wang, Room: A216-217 (2nd Floor) ……………… ##

IS098: Spatial and Network Econometrics

Organizer: Xingbai Xu, Room: A301-302 (3rd Floor) ……………… ##

IS100: Doctoral Dissertation in Statistical Machine Learning

Organizer: Zhihua Zhang, Room: A320-A321 (3rd Floor) ……………… ##

IS088: Special Topic on Data Asset Accounting (数据资产核算专题)

Organizer: Haiqi Lv, Room: A203 (2nd Floor) ……………… ##

IS086: Business Big Data Analysis and Application (商务大数据分析与应用)

Organizer: Xiuying Ma, Room: A218 (2nd Floor) ……………… ##

CS013: Statistical Inference for Functional/Time Series Data

Room: A303 (3rd Floor) ……………… ##

CS041: Biostatistics and Industrial Statistics

Room: A305 (3rd Floor) ……………… ##

CS053: Complex Statistical Models and Their Applications

Room: A306 (3rd Floor) ……………… ##

CS045: Bayesian and Causal Inference

Room: A307 (3rd Floor) ……………… ##

CS046: Recent Advances in Large Models and Artificial Intelligence (大模型和人工智能的最近研究)

Room: A308 (3rd Floor) ……………… ##

CS047: Statistical Inference in Complex Data

Room: A315 (3rd Floor) ……………… ##

CS048: Factor Model and Community Network

Room: A316 (3rd Floor) ……………… ##

CS049: Bayesian and Machine Learning

Room: A317 (3rd Floor) ……………… ##

CS020: Optimal Subsampling

Room: A318 (3rd Floor) ……………… ##

CS050: Mathematical Statistics and Industrial Statistics

Room: A322 (3rd Floor) ……………… ##

CS051: Quantile Regression and Dimension Reduction

Room: B603 (3rd Floor) ……………… ##

CS052: Statistical Models and Methods in Economics and Finance

Room: B606 (3rd Floor) ……………… ##

Poster

Poster 001-027, South Rest Area (1st Floor)………………… ##


July 12, 9:00-11:50

Plenary Talk 1: Neural Causal AI: Adversarial Invariance Learning from Heterogeneous Environments

Jianqing Fan

Princeton University

Abstract: This talk develops nonparametric invariance and causal learning from multiple-environment regression models in which data from heterogeneous experimental settings are collected. The joint distribution of the response variable and covariates may vary across environments. Yet, the conditional expectation of the outcome given the unknown set of important or quasi-causal variables is invariant across environments. Our idea of invariance and causal learning is to find a set of variables that is as exogenous as possible across multiple environments while minimizing the empirical loss. To realize this idea, we propose a Neural Adversarial Invariant Learning (NAIL) framework, in which the unknown regression is represented by a ReLU network and invariance across multiple environments is tested using adversarial neural networks. Leveraging the representation power of neural networks, we introduce neural causal networks based on a focused adversarial invariance regularization (FAIR) and its novel training algorithm. It is shown that FAIR-NN can find the invariant and quasi-causal variables and that the resulting procedure is adaptive to low-dimensional composition structures. The combinatorial optimization problem is implemented by a Gumbel approximation with decreasing temperature and stochastic approximations. The procedures are convincingly demonstrated using simulated examples.

Joint work with: Cong Fang, Yihong Gu, and Peter Bühlmann.
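
The adversarial invariance idea lends itself to a compact schematic. The minimax display below is our paraphrase of the abstract, not the paper's exact objective; the notation X_S for a candidate variable set, the penalty weight lambda, and the unit-norm adversary class are our own assumptions:

    \[
    \min_{S,\,g}\; \sum_{e\in\mathcal{E}} \mathbb{E}_e\big[(Y - g(X_S))^2\big]
    \;+\; \lambda \max_{\|f\|\le 1}\; \sum_{e\in\mathcal{E}} \Big(\mathbb{E}_e\big[(Y - g(X_S))\, f(X_S)\big]\Big)^2
    \]

Here \mathcal{E} indexes the heterogeneous environments, g is the ReLU regression network, and the adversary f probes the invariance condition E[Y | X_S, e] = E[Y | X_S] for all e: if the residual Y - g(X_S) correlates with some function of X_S in some environment, the penalty inflates, steering the search toward quasi-causal variable sets.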

Plenary Talk 2: Genomic Testing in the Presence of Unmeasured Confounding and Missing Data

Kathryn Roeder

Carnegie Mellon University

Abstract: When aiming to identify differential genomic outcomes such as gene expression or protein abundance, thousands of simultaneous hypothesis tests are routinely performed. These tests can be biased by the presence of unmeasured confounders and missing data. Recent advances in scRNA-seq and CRISPR technologies have allowed for the study of case vs. control comparisons and the characterization of experimental perturbations at single-cell resolution, further exacerbating these challenges. We develop a large-scale hypothesis testing solution for multivariate generalized linear models in the presence of confounding effects. Next, realizing that a number of advantages can be accrued by taking a causal inference approach, we expand this solution by exploring doubly robust and proximal inference options as well. As genomic studies progress from transcriptomic to proteomic readouts, new challenges have arisen, most notably large numbers of missing values. A common strategy to address this issue is to rely on an imputed dataset, which often introduces systematic bias into downstream analyses. By contrast, we develop a statistical framework inspired by doubly robust estimators that offers valid and efficient inference for proteomic data. Our framework relies on powerful machine learning tools, such as variational autoencoders, to augment the imputation quality with high-dimensional peptide data.

Plenary Talk 3: Multi-Scale Spatial-Temporal Data in Brain Science: Data, Model and Theory

Jianfeng Feng

Fudan University

Abstract: In brain science, we have accumulated many huge datasets spanning the subcellular, cellular, and tissue levels, with multi-spatial structures and evolving on multiple temporal scales. How to develop statistical approaches to tackle these structured and often noncontinuous data (point processes or point fields) is a challenging issue. We will first review some of the existing methods to analyze the data. Many novel applications are included to explain the challenging issues we are facing at the moment. We then introduce the digital twin approach to model the whole human brain, with 86 billion neurons and 100 trillion parameters being estimated. Finally, using first principles, a new type of neural network, the moment neuronal network approach, is covered to better approximate the biological neuron network and potentially lead to AGI. Our talk serves as a typical showcase of how applied mathematicians can contribute to and help the development of a data-rich area.

July 12, 14:00-15:40

Invited Session IS016: Exploring the Frontiers of Machine Learning: Algorithms, Theoretical Insights and Applications

A Very Dutch Scandal: Did Overhyped Stats Ruin Dutch Appetites?

Fengnan Gao

University College Dublin

Abstract: Applying simple linear regression models, an economist analyzed a published dataset from an influential annual ranking, in 2016 and 2017, of consumer outlets for Dutch New Herring and concluded that the ranking was manipulated. His finding was promoted by his university in national and international media, which led to public outrage and the ensuing discontinuation of the survey. We reconstitute the dataset, correcting errors and exposing features already important in a descriptive analysis of the data. The economist has continued his investigations, and in a follow-up publication repeats the same accusations. We point out errors in his reasoning and show that the alleged evidence for deliberate manipulation of the ranking could easily be an artifact of specification errors. Temporal and spatial factors are both important and complex, and their effects cannot be captured using simple models, given the small sample sizes and the many factors determining the perceived taste of a food product. The talk is based on the journal version published in Scandinavian Journal of Statistics and the cover story published in Significance, August 2023 issue.

A Statistical Framework of Watermarks for Large Language Models: Pivot, Detection Efficiency and Optimal Rules

Xiang Li

University of Pennsylvania

Abstract: Since ChatGPT was introduced in November 2022, embedding (nearly) unnoticeable statistical signals into text generated by large language models (LLMs), also known as watermarking, has been used as a principled approach to the provable detection of LLM-generated text from its human-written counterpart. In this paper, we introduce a general and flexible framework for reasoning about the statistical efficiency of watermarks and designing powerful detection rules. Inspired by the hypothesis testing formulation of watermark detection, our framework starts by selecting a pivotal statistic of the text and a secret key, provided by the LLM to the verifier, to enable control of the false positive rate (the error of mistakenly detecting human-written text as LLM-generated). Next, this framework allows one to evaluate the power of watermark detection rules by obtaining a closed-form expression for the asymptotic false negative rate (the error of incorrectly classifying LLM-generated text as human-written). Our framework further reduces the problem of determining the optimal detection rule to solving a minimax optimization program. We apply this framework to two representative watermarks, one of which has been internally implemented at OpenAI, and obtain several findings that can be instrumental in guiding the practice of implementing watermarks. In particular, we derive optimal detection rules for these watermarks under our framework. Through numerical experiments, these theoretically derived detection rules are demonstrated to be competitive and sometimes enjoy higher power than existing detection approaches.

Joint work with Feng Ruan, Huiyuan Wang, Qi Long, and Weijie Su.
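
The abstract does not spell out a specific pivot, so as a minimal illustration here is one well-known pivotal-statistic detector, the Gumbel-max scheme, written in Python; the function names pseudo_uniforms and detect_watermark are our own, and this is a sketch of the generic idea rather than the paper's optimal rule:

    import numpy as np
    from scipy import stats

    def pseudo_uniforms(key, t, vocab_size):
        # Stand-in for the keyed hash shared between LLM and verifier; a real
        # system would use a cryptographic PRF (Python's hash() is per-process).
        rng = np.random.default_rng(abs(hash((key, t))) % 2**32)
        return rng.random(vocab_size)

    def detect_watermark(tokens, key, vocab_size, alpha=0.01):
        # Pivot: for human text, r[w_t] ~ U(0,1), so each -log(1 - r[w_t]) is
        # Exp(1) and their sum is Gamma(n, 1); large sums indicate watermarking.
        s = sum(-np.log(1.0 - pseudo_uniforms(key, t, vocab_size)[w])
                for t, w in enumerate(tokens))
        p_value = stats.gamma.sf(s, a=len(tokens))
        return p_value < alpha, p_value

The exact Gamma(n, 1) null distribution of the pivot is what delivers the false-positive-rate control discussed in the abstract.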

Understanding the Implicit Bias of Stochastic Gradient Descent: A Dynamical Stability Perspective

Lei Wu

Peking University

Abstract: In deep learning, models are often over-parameterized, which leads to concerns about algorithms picking solutions that generalize poorly. Fortunately, stochastic gradient descent (SGD) always converges to solutions that generalize well even without any explicit regularization, suggesting a certain "implicit regularization" at work. This talk will provide an explanation of this striking phenomenon from a stability perspective. Specifically, we show that a stable minimum of SGD must be flat, as measured by various norms of the local Hessian. Furthermore, these flat minima provably generalize well for two-layer neural networks and diagonal linear networks. As opposed to popular continuous-time analyses, our stability analysis respects the discrete nature of SGD and can explain the effect of finite learning rates and batch sizes, and why SGD often generalizes better than GD.
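
A one-dimensional sanity check (ours, not the talk's general result) already shows why dynamical stability forces flatness. For a quadratic loss, the gradient descent iterate contracts only when the curvature is bounded by 2/eta:

    \[
    L(x) = \tfrac{1}{2} a x^2, \qquad
    x_{k+1} = (1 - \eta a)\, x_k, \qquad
    |1 - \eta a| \le 1 \;\Longleftrightarrow\; 0 \le a \le \tfrac{2}{\eta}.
    \]

So any minimum that GD with learning rate eta can settle into has sharpness (Hessian) at most 2/eta; the talk's analysis refines this picture for SGD, where gradient noise tied to the batch size imposes an even stricter flatness requirement.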

Algorithms and Incentives in Statistics

Haifeng Xu

University of Chicago

Abstract: A generic question in statistics is to design approaches that take data as input and output estimates of certain parameters or predictions of some quantities. The standard paradigm often assumes these data are objectively generated from distributions, without being affected by any human factors. However, this paradigm ceases to be true when our predictions or estimated parameters will in turn affect the data providers' welfare. In such situations, data providers have incentives to alter the data for their own benefit. Thus the design of any statistical method must account for potential data manipulations due to data providers' incentives. This talk will introduce a general "incentive-aware" framework for designing prediction methods. I will illustrate this design paradigm with two examples: (1) a very recent and timely application of eliciting authors' truthful private information to improve the peer review systems of today's massive-scale machine learning conferences; and (2) a very classic problem of PAC-learning classifiers, but with strategic providers of data features. In both problems, I will illustrate how the presence of incentives can fundamentally change the problem's statistical efficiency and how algorithms can help to overcome some statistical barriers.

Invited Session IS095: Modern Statistical Learning for High-Dimensional Data

Adaptive Shrinkage Estimation for High-Dimensional Change Point Detection

Yingxing Li

Xiamen University

Abstract: In this paper, we propose an adaptive sparse group LASSO estimator for high-dimensional change point detection. Our method can simultaneously estimate the change structure as well as the model parameters for different types of change point patterns and signal strengths. The penalty parameters are determined by data-driven algorithms. As a result, it is not necessary to know or pretest whether a change point is present, or where it occurs. We establish the theoretical properties of the proposed estimator even when the magnitude of the change shrinks to zero. A simulation study and an empirical application demonstrate the excellent performance of our approach.

Joint work with Xiangfu Luo.

A Unifying Dependent Combination Framework with Applications to Association Tests

Xiufan Yu

University of Notre Dame

Abstract: We introduce a novel meta-analysis framework to combine dependent tests under a general setting, and utilize it to synthesize various association tests that are calculated from the same dataset. Our development builds upon the classical meta-analysis methods of aggregating p-values and also a more recent general method of combining confidence distributions, but makes generalizations to handle dependent tests. The proposed framework ensures rigorous statistical guarantees, and we provide a comprehensive study comparing it with various existing dependent combination methods. Notably, we demonstrate that the widely used Cauchy combination method for dependent tests, referred to as the vanilla Cauchy combination in this article, can be viewed as a special case within our framework. Moreover, the proposed framework provides a way to address the problem when the distributional assumptions underlying the vanilla Cauchy combination are violated. Our numerical results demonstrate that ignoring the dependence among the to-be-combined components may lead to a severe size distortion phenomenon. Compared to existing p-value combination methods, including the vanilla Cauchy combination method, the proposed combination framework handles the dependence accurately and utilizes the information efficiently to construct tests with accurate size and enhanced power.

Joint work with Linjun Zhang, Arun Srinivasan, Min-ge Xie, and Lingzhou Xue.
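
The "vanilla Cauchy combination" that the abstract takes as its special case has a standard closed form; here is a minimal Python sketch (the function name is ours):

    import numpy as np
    from scipy import stats

    def cauchy_combination(pvalues, weights=None):
        # Each p-value is mapped to a standard Cauchy variable; the weighted
        # sum is again (approximately) Cauchy under arbitrary dependence, so
        # its upper tail gives the combined p-value.
        p = np.asarray(pvalues, dtype=float)
        w = np.full(p.size, 1.0 / p.size) if weights is None else np.asarray(weights)
        t = np.sum(w * np.tan((0.5 - p) * np.pi))
        return stats.cauchy.sf(t)

    # Example: cauchy_combination([0.01, 0.20, 0.03]) -> one combined p-value

The heavy Cauchy tail is what makes the rule insensitive to the dependence structure; the talk's framework generalizes this and repairs it when its distributional assumptions fail.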

Communication-Efficient and Distributed-Oracle Estimation for High-Dimensional Quantile Regression

Songshan Yang

Renmin University of China

Abstract: In this article, we present a novel communication-efficient estimator for distributed high-dimensional quantile regression with folded-concave penalties. An iterative multi-step (IM) algorithm is employed to tackle the nonconvexity of the objective function, taking into account both statistical accuracy and communication constraints. We demonstrate that the proposed IM estimators share similar properties with the global folded-concave penalized estimator. To establish the theoretical results, we introduce a new concept called the distributed-oracle estimator. We prove that the proposed estimator converges to the distributed-oracle estimator with high probability. Compared to the L1-penalized method, the proposed estimator possesses a faster rate of convergence and requires milder conditions to achieve support recovery. Furthermore, we extend our framework to facilitate distributed inference for preconceived low-dimensional components within the high-dimensional model. We derive the limiting distribution of the corresponding test statistic under the null hypothesis and local alternatives. In addition, a new feature-splitting algorithm is devised to accommodate high-dimensional data within the distributed system. Extensive numerical studies demonstrate the effectiveness and validity of our proposed estimation and inference methods. A real example is also presented for illustration.

Joint work with Yifan Gu, Hanfang Yang and Xuming He.

Efficient Learning of Directed Acyclic Graphs in Heavy-Tailed Data

Wei Zhou

Southwestern University of Finance and Economics

Abstract: Directed acyclic graph (DAG) models are widely used to discover causal relationships among random variables. However, most existing DAG learning algorithms are not directly applicable to heavy-tailed data, which are commonly observed in finance and other fields. In this article, we propose an efficient two-step algorithm based on topological layers to learn linear DAGs with heavy-tailed error distributions, which include the Pareto, Fréchet, log-normal, and Cauchy distributions, among others. First, we reconstruct the topological layers hierarchically in a top-down fashion based on a new reconstruction criterion for heavy-tailed DAGs, without assuming the popularly employed faithfulness condition. Second, we recover the directed edges via modified conditional independence testing for heavy-tailed distributions. We theoretically demonstrate consistency for the exact DAG structures. Monte Carlo simulations validate the outstanding finite-sample performance of the proposed algorithm compared with competing methods. In the real data analysis, we analyze the exchange rates among 17 OECD countries and uncover the sources of financial contagion and its pathways, some of which may not be detected by existing methods in empirical finance. This helps to identify several currencies as good options for risk diversification and to reduce global systemic risk.

Joint work with Xueqian Kang, Wei Zhong, and Junhui Wang.

Invited Session IS005: AI and Machine Learning in Complex Biomedical Data

Tackling Biased, Incomplete Data in Electronic Health Records

Qi Long

University of Pennsylvania

Abstract: Electronic health records (EHR), routinely collected as part of healthcare delivery, have great potential to be utilized to advance precision medicine. They contain multiple years of health information that can be leveraged for risk prediction, disease detection, and treatment evaluation. However, they do not have a consistent, standardized format across institutions, particularly in the United States, and can present significant analytical challenges: they contain multi-scale data from heterogeneous domains and include both structured and unstructured data. Data for individual patients are collected at irregular time intervals and with varying frequencies. In addition, EHR can reflect inequity; for example, patients with less access to healthcare, often people of color or with lower socioeconomic status, tend to have more incomplete data in EHR. Many of these issues can contribute to biased data collection. In this talk, I will share our recent research on developing AI/ML models for addressing biased, incomplete data in EHR, including more accurate assessment of the harmful impact of incomplete EHR data on algorithmic fairness, challenges associated with mitigating such bias, and potential strategies.

Bias Correction Models for Electronic Health Records Data in the Presence of Non-random Sampling

Judy Zhong

New York University

Abstract: Electronic health records (EHRs) contain rich clinical information for millions of patients and are increasingly used for public health research. However, non-random inclusion of subjects in EHRs can result in selection bias, with factors such as demographics, socioeconomic status, healthcare referral patterns, and underlying health status playing a role. While this issue has been well documented, little work has been done to develop or apply bias-correction methods, often because most of these factors are unavailable in EHRs. To address this gap, we propose a series of Heckman-type bias correction methods that incorporate social determinants of health as selection covariates to model the EHR non-random sampling probability. Through simulations under various settings, we demonstrate the effectiveness of our proposed method in correcting biases in both the association coefficients and the outcome mean. Our method augments the utility of EHRs for public health inference, as we show by estimating the prevalence of cardiovascular disease and its correlation with risk factors in the New York City network of EHRs.
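
For orientation, the classic two-step Heckman correction that this family of methods builds on can be sketched in a few lines of Python; this is a textbook version, not the authors' exact estimator, and all names below are our own illustrative choices:

    import numpy as np
    import statsmodels.api as sm
    from scipy.stats import norm

    def heckman_two_step(y, X, Z, selected):
        # Step 1: probit model for inclusion in the EHR, using selection
        # covariates Z (e.g., social determinants of health), fit on everyone.
        Zc = sm.add_constant(Z)
        probit = sm.Probit(selected.astype(float), Zc).fit(disp=0)
        xb = Zc @ probit.params                   # linear selection index
        imr = norm.pdf(xb) / norm.cdf(xb)         # inverse Mills ratio
        # Step 2: outcome regression on the selected subsample, augmented with
        # the inverse Mills ratio as the bias-correction covariate.
        design = sm.add_constant(np.column_stack([X[selected], imr[selected]]))
        return sm.OLS(y[selected], design).fit()

A significantly nonzero coefficient on the inverse Mills ratio is the usual signal that naive analysis of the selected sample would have been biased.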

Federated Efficient Estimation of Average Treatment Effects

Rui Duan

Harvard University

Abstract: The expanding opportunities for multi-institutional collaborative research and data integration bring important opportunities for statistical learning and inference but also present significant challenges. This talk addresses the issues of integrating heterogeneous data from multiple sources to estimate and infer treatment effects for a specific target population. We explore critical concerns such as data heterogeneity, model misspecification, and the barriers to data sharing. To overcome these obstacles, we introduce methods that adapt to source-specific heterogeneity in conditional outcome distributions. Our decentralized approaches allow each site to share only summary statistics, achieving asymptotic efficiency equivalent to using combined individual-level data. This allows us to estimate the average treatment effect without compromising privacy. We present results from both theoretical and empirical investigations that assess the performance of our proposed methods across various settings. Additionally, we discuss the real-world implementation of these methods in large-scale nationwide clinical databases, highlighting the effectiveness of our approach in diverse and complex data environments.

Invited Session IS026: Innovative Statistical Methods for Data with Complex Structures

A Stability Approach for Feature Selection with False Discovery Rate Control

Wei Zhong

Xiamen University

Abstract: In this talk, we first give an overview of false discovery rate (FDR) controlling methods for multiple testing and variable selection in high-dimensional data analysis, including the BH method, data-splitting-based methods, knockoffs, etc. Although most of these methods are successfully and widely used in practice, the results of some methods are unstable due to inherent randomness. For example, different runs of model-X knockoffs on the same dataset result in different sets of selected variables due to the randomness of knockoff data generation. Ren and Barber (2023) introduced a derandomized knockoffs method to derandomize model-X knockoffs by leveraging e-values for false discovery rate control. But it has non-negligible drawbacks, such as the need to select two FDR parameters and a tendency toward low power. To make statistical results stable and reproducible, we introduce a general stability approach for variable selection algorithms with FDR control. Our approach aggregates e-values generated from multiple runs of the base algorithm to construct a stabilized e-value, which leads to higher power without loss of stability. It is very general and can be applied to almost all FDR control methods, such as knockoffs and data-splitting methods. Theoretical properties of this stability method are also studied, such as an asymptotic FDR control guarantee. Extensive numerical experiments and real data applications demonstrate that the proposed method is generally more powerful and stable than existing competitors.
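
The e-value machinery the abstract relies on is easy to sketch. Below, stabilized_evalues averages e-values over repeated runs (the mean of e-values is again an e-value), in the spirit of the abstract's stabilized e-value, and ebh_reject is the standard e-BH procedure; run_base_algorithm is a hypothetical stand-in for one randomized run of, e.g., model-X knockoffs returning per-variable e-values, and the paper's exact aggregation may differ:

    import numpy as np

    def stabilized_evalues(run_base_algorithm, n_runs=50):
        # Average e-values across independent runs of the base algorithm.
        runs = np.stack([run_base_algorithm() for _ in range(n_runs)])
        return runs.mean(axis=0)

    def ebh_reject(evalues, alpha=0.1):
        # e-BH: reject the top-k hypotheses if the k-th largest e-value
        # is at least m / (alpha * k); valid under arbitrary dependence.
        e = np.asarray(evalues, dtype=float)
        m = e.size
        order = np.argsort(-e)
        thresh = m / (alpha * (np.arange(m) + 1))
        k = np.max(np.nonzero(e[order] >= thresh)[0] + 1, initial=0)
        return order[:k]          # indices of rejected hypotheses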

High-Dimensional Scale-Invariant Discriminant Analysis

Shurong Zheng

Northeast Normal University

Abstract: In this paper, we propose a scale-invariant linear discriminant analysis classifier for high-dimensional data. The method is valid whether the data dimension is smaller or larger than the sample size, and it is also suitable for missing data. Based on recent advances on the sample correlation matrix in random matrix theory, we derive the asymptotic limits of the error rate, which characterize the influence of the data dimension and the tuning parameter. The major advantage of our proposed classifier is its scale invariance: it is applicable regardless of the variances of the features. Several numerical studies are conducted, and our proposed classifier performs favorably in comparison with some existing methods.
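
To illustrate the generic idea of working on the correlation scale (a sketch under our own simplifying assumptions; the paper's estimator, regularization, and tuning may differ):

    import numpy as np

    def scale_invariant_lda(X1, X2, ridge=0.1):
        # Standardize features so that rescaling any coordinate leaves the
        # rule unchanged; 'ridge' is a generic regularizer standing in for
        # the tuning parameter analyzed in the paper.
        X = np.vstack([X1, X2])
        mu, sd = X.mean(axis=0), X.std(axis=0)
        Z1, Z2 = (X1 - mu) / sd, (X2 - mu) / sd
        R = np.corrcoef(np.vstack([Z1, Z2]), rowvar=False)
        w = np.linalg.solve(R + ridge * np.eye(R.shape[0]),
                            Z1.mean(axis=0) - Z2.mean(axis=0))
        c = 0.5 * w @ (Z1.mean(axis=0) + Z2.mean(axis=0))
        return lambda x: int(((x - mu) / sd) @ w > c)   # 1 -> class of X1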

On Functional Processes with Multiple Discontinuities

Yaguang Li

University of Science and Technology of China

Abstract: We consider the problem of estimating multiple change points for a functional data process. There are numerous examples in science and finance in which the process of interest may be subject to sudden changes in the mean. The process data that are not in a close vicinity of any change point can be analysed by the usual nonparametric smoothing methods. However, the data that are close to change points, and that contain the most pertinent information about the structural breaks, need to be handled with special care. This paper considers a half-kernel approach that addresses the inference of the total number, locations, and jump sizes of the changes. Convergence rates and asymptotic distributional results for the proposed procedures are thoroughly investigated. Simulations are conducted to examine the performance of the approach, and a number of real data sets are analysed for illustration.

Structured Feature Ranking for Genomic Marker Identification Accommodating Multiple Types of Networks

Xingdong Feng

Shanghai University of Finance and Economics

Abstract: Numerous statistical methods have been developed to search for genomic markers associated with the development, progression, and response to treatment of complex diseases. Among them, feature ranking plays a vital role due to its intuitive formulation and computational efficiency. However, most of the existing methods are based on the marginal importance of molecular predictors and share the limitation that the dependence (network) structures among predictors are not well accommodated, whereas a disease phenotype usually reflects various biological processes that interact in a complex network. In this paper, we propose a structured feature ranking method for identifying genomic markers, where such network structures are effectively accommodated using Laplacian regularization. The proposed method innovatively investigates multiple network scenarios, where the networks can be known a priori or estimated from the data. In addition, we rigorously explore the noise and uncertainty in the networks and control their impact with proper selection of tuning parameters. These characteristics give the proposed method especially broad applicability. Theoretical results for our proposal are rigorously established. Compared to the original marginal measure, the proposed network-structured measure can achieve sure screening properties with a faster convergence rate under mild conditions. Extensive simulations and an analysis of The Cancer Genome Atlas melanoma data demonstrate the improved finite-sample performance and practical usefulness of the proposed method.

Joint work with Yeheng Ge, Tao Li, and Mengyun Wu.

Invited Session IS012: Data Science Methods for Complex Data with Endogeneity and Heterogeneity

BELIEF in Dependence: Leveraging Atomic Linearity in Data Bits for Rethinking Generalized Linear Models

Kai Zhang

University of North Carolina, Chapel Hill

Abstract: Two linearly uncorrelated binary variables must also be independent, because non-linear dependence cannot manifest with only two possible states. This inherent linearity is the atom of dependency constituting any complex form of relationship. Inspired by this observation, we develop a framework called binary expansion linear effect (BELIEF) for understanding arbitrary relationships with a binary outcome. Models from the BELIEF framework are easily interpretable because they describe the association of binary variables in the language of linear models, yielding convenient theoretical insight and striking Gaussian parallels. With BELIEF, one may study generalized linear models (GLMs) through transparent linear models, providing insight into how the choice of link affects modeling. For example, setting a GLM interaction coefficient to zero does not necessarily lead to the kind of no-interaction model assumption understood under linear model counterparts. Furthermore, for a binary response, maximum likelihood estimation for GLMs paradoxically fails under complete separation, when the data are most discriminative, whereas BELIEF estimation automatically reveals the perfect predictor in the data that is responsible for the complete separation. We explore these phenomena and provide related theoretical results. We also provide a preliminary empirical demonstration of some theoretical results.

Joint work with Benjamin Brown, Xiao-Li Meng.
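
A toy sketch of the binary-expansion design for a single covariate on [0, 1] may help fix ideas; the function names and the +/-1 coding convention are ours, and the paper's estimation details go well beyond this:

    import itertools
    import numpy as np

    def binary_expansion(x, depth=2):
        # Dyadic bits of x in [0, 1]: bit k is the k-th binary digit, coded +/-1.
        bits = [np.where(np.floor(x * 2**k) % 2 == 1, 1.0, -1.0)
                for k in range(1, depth + 1)]
        return np.column_stack(bits)

    def belief_design(x, depth=2):
        # Intercept plus all products of bits: the "atomic" linear regressors.
        B = binary_expansion(x, depth)
        cols = [np.ones_like(x)]
        for r in range(1, depth + 1):
            for idx in itertools.combinations(range(depth), r):
                cols.append(np.prod(B[:, list(idx)], axis=1))
        return np.column_stack(cols)

    # Regressing a +/-1 coded binary response on this design is plain least
    # squares: beta, *_ = np.linalg.lstsq(belief_design(x), y, rcond=None)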

A Scalable, Interpretable, and Data-Driven Approach to Analyzing Unstructured Information

Wu Zhu

Tsinghua University

Abstract: We introduce a general framework for analyzing large-scale text-based data, combining the strengths of neural-network language processing and generative statistical modeling to create a factor structure of unstructured data for downstream regressions used in the social sciences. We generate textual factors by (i) representing texts using vector word embeddings, (ii) clustering words using locality-sensitive hashing, and (iii) identifying spanning clusters/factors through topic modeling. Our data-driven approach captures complex linguistic structures while ensuring computational scalability and economic interpretability. We also discuss applications of textual factors in (i) prediction and inference, (ii) interpreting (non-text-based) models and variables, and (iii) constructing new text-based metrics and explanatory variables, with illustrations using topics in finance and economics such as macroeconomic forecasting and factor asset pricing. Finally, we provide a flexible statistical package of textual factors for online distribution to facilitate future applications.
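
A toy version of the three-step pipeline, with substitutions of our own (KMeans standing in for the paper's locality-sensitive hashing, NMF for its topic model), might look as follows:

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.decomposition import NMF

    def textual_factors(doc_tokens, embeddings, n_clusters=50, n_factors=5, seed=0):
        # embeddings: dict mapping word -> vector (step i, assumed precomputed).
        words = sorted(embeddings)
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
        labels = dict(zip(words,
                          km.fit_predict(np.array([embeddings[w] for w in words]))))
        # Step ii proxy: document-by-cluster usage counts.
        counts = np.zeros((len(doc_tokens), n_clusters))
        for i, toks in enumerate(doc_tokens):
            for t in toks:
                if t in labels:
                    counts[i, labels[t]] += 1
        # Step iii proxy: low-rank factor loadings for downstream regressions.
        return NMF(n_components=n_factors, random_state=seed).fit_transform(counts)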

Regularizing BELIEF for Smooth Dependency

Wan Zhang

University of North Carolina, Chapel Hill

Abstract: As the complexity of models and the volume of data increase, interpretable methods for modeling complicated dependence are in great need. A recent framework of binary expansion linear effect (BELIEF) provides a "divide and conquer" approach to decompose any complex form of dependency into small linear regressions over data bits. Although BELIEF can be used to approximate any relationship, it faces an important challenge of high dimensionality. To overcome this obstacle, we propose a novel definition of smoothness for binary interactions and create a regularization of BELIEF under smoothness interpretations. We prove that there is a one-to-one correspondence between each marginal binary interaction and the smoothness we define. Additionally, we show that in higher dimensions, the smoothness can be expressed as a product of that for marginal binary interactions. Based on these observations, we propose to model the smooth form of dependency with a generalized LASSO model with larger penalties on less smooth terms. Numerical studies show that the smooth LASSO offers clear interpretability and effectiveness for nonlinear and high-dimensional data.

Joint work with Heyang Ni, Yufeng Liu, Kai Zhang.

Personalized Reinforcement Learning for Healthcare: With Applications to Sepsis Management in ICU

Linda Zhao

University of Pennsylvania

Abstract: In numerous fields such as healthcare, public policy, and e-commerce, a primary objective is to make multiple decisions simultaneously in a dynamic and personalized fashion. This sequential decision-making process is especially relevant in healthcare for developing personalized treatment plans. The main challenge stems from the dynamic and personalized nature of the process: each patient's history and unique responses to treatments significantly influence their current and future care. To tackle these challenges, we develop a personalized reinforcement learning algorithm that provides optimal and interpretable personalized treatment decisions. Focusing on sepsis management in ICUs, a condition that is the main cause of mortality in hospitals, accounts for more than $20 billion in total costs, and yet has no consensus on optimal treatment strategies, we demonstrate the value of our algorithm on ICU data from five Boston hospitals. We show that our algorithm can outperform standard care by providing more effective and personalized treatment plans for sepsis patients, showcasing the potential of our approach to improve outcomes and reduce costs in complex healthcare settings.

Invited Session IS004: Advancing Statistical Frontiers in Data Privacy Protection

Gaussian Differential Privacy on Riemannian Manifolds

Linglong Kong

University of Alberta

Abstract: We develop an advanced approach for extending Gaussian differential privacy (GDP) to general Riemannian manifolds. The concept of GDP stands out as a prominent privacy definition that strongly warrants extension to manifold settings, due to its central limit properties. By harnessing the power of the renowned Bishop-Gromov theorem in geometric analysis, we propose a Riemannian Gaussian distribution that integrates the Riemannian distance, allowing us to achieve GDP on Riemannian manifolds with bounded Ricci curvature. To the best of our knowledge, this work marks the first instance of extending the GDP framework to accommodate general Riemannian manifolds, encompassing curved spaces and circumventing the reliance on tangent-space summaries. We provide a simple algorithm to evaluate the privacy budget μ on any one-dimensional manifold and introduce a versatile Markov chain Monte Carlo (MCMC)-based algorithm to calculate μ on any Riemannian manifold with constant curvature. Through simulations on one of the most prevalent manifolds in statistics, the unit sphere S^d, we demonstrate the superior utility of our Riemannian Gaussian mechanism in comparison to the previously proposed Riemannian Laplace mechanism for implementing GDP.

Joint work with Yangdi Jiang, Xiaotian Chang, Yi Liu, Lei Ding, Bei Jiang.
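
For readers new to GDP, the Euclidean definition that the talk extends can be stated in one line. These are standard facts from the GDP literature, not results of this talk: a mechanism is μ-GDP if distinguishing its outputs on neighboring datasets is at least as hard (in the trade-off-function sense) as the Gaussian test below, and adding Gaussian noise calibrated to the L2 sensitivity Δ achieves it:

    \[
    G_\mu(\alpha) = \Phi\big(\Phi^{-1}(1-\alpha) - \mu\big), \qquad
    M(D) = f(D) + \mathcal{N}(0, \sigma^2 I)\ \text{is } \mu\text{-GDP with } \sigma = \Delta/\mu.
    \]

The talk's manifold extension builds the noise density from the squared Riemannian distance instead of the squared Euclidean distance, with the Bishop-Gromov theorem used to control this density on spaces of bounded Ricci curvature.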

Unveiling Enhanced Privacy in Data Science via Statistical Methods

Chendi Wang

University of Pennsylvania

Abstract: Recently, f-differential privacy (f-DP), which evaluates the privacy level of an algorithm from a hypothesis testing perspective using trade-off functions, has been established. However, accurately accounting for the privacy level of a privacy-preserving algorithm in practical applications is challenging due to the co-existence of multiple algorithm modules. In this talk, we demonstrate that f-DP provides state-of-the-art privacy analysis for various applications. We first apply f-DP to assess the privacy level of the U.S. Census data, a critical application of differential privacy. Our analysis shows that achieving the same privacy level requires less noise when using f-DP compared to the zero-concentrated differential privacy method currently used by the Census Bureau, thereby enhancing the utility of privatized Census data. Additionally, we propose an inequality for f-DP to handle mixture distributions caused by machine learning algorithms, which implies the joint convexity of f-divergences. This inequality is shown to be tight in widely used shuffling models. Applying this inequality to federated learning, we demonstrate that f-DP can improve the privacy-utility tradeoff in federated learning.

Joint work with Buxin Su, Xiang Li, Jiayuan Ye, Qi Long, Reza Shokri, Weijie Su.

Differentially Private Estimation and Inference in High-Dimensional Regression with FDR Control

Zhanrui Cai

The University of Hong Kong

Abstract: This paper presents novel methodologies for conducting practical differentially private (DP) estimation and inference in high-dimensional linear regression. We start by proposing a differentially private Bayesian information criterion for selecting the unknown sparsity parameter in DP sparse linear regression, eliminating the need for prior knowledge of model sparsity, a requisite in the existing literature. We then propose a differentially private debiased algorithm that enables privacy-preserving inference on a particular subset of regression parameters. Our proposed method enables accurate and private inference on the regression parameters by leveraging the inherent sparsity of high-dimensional linear regression models. Additionally, we address private feature selection, considering multiple testing in high-dimensional linear regression, by introducing a differentially private multiple testing procedure that controls the false discovery rate (FDR). This allows for accurate and privacy-preserving identification of significant predictors in the regression model. Through extensive simulations and real data analysis, we demonstrate the efficacy of our proposed methods in conducting inference for high-dimensional linear models while safeguarding privacy and controlling the FDR.

Joint work with Sai Li, Xintao Xia, Linjun Zhang.

Online Local Differentially Private Quantile Inference via Self-normalization

Bei Jiang

University of Alberta

Abstract: Based on binary inquiries, we develop an algorithm to estimate population quantiles under local differential privacy (LDP). By self-normalizing, our algorithm provides asymptotically normal estimation with valid inference, resulting in tight confidence intervals without the need to estimate nuisance parameters. Our proposed method can be conducted fully online, leading to high computational efficiency and minimal storage requirements. We also prove an optimality result by an elegant application of a central limit theorem of Gaussian differential privacy (GDP) when targeting the frequently encountered median estimation problem. With mathematical proof and extensive numerical testing, we demonstrate the validity of our algorithm both theoretically and experimentally.

Joint work with Yi Liu, Qirui Hu, Lei Ding, Linglong Kong.
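
The generic binary-inquiry idea behind such estimators can be sketched in a few lines; this mirrors the abstract's setup only loosely and is not the authors' exact algorithm or their self-normalized inference step:

    import numpy as np

    def ldp_quantile(stream, tau=0.5, eps=1.0, q0=0.0, c=1.0):
        # Each user privatizes the bit 1{X <= q_t} via randomized response;
        # the estimate follows a Robbins-Monro stochastic approximation.
        p_keep = np.exp(eps) / (np.exp(eps) + 1)   # eps-LDP randomized response
        q = q0
        for t, x in enumerate(stream, start=1):
            b = float(x <= q)
            if np.random.random() > p_keep:        # flip with prob 1/(e^eps + 1)
                b = 1.0 - b
            b_debiased = (b - (1 - p_keep)) / (2 * p_keep - 1)  # unbiased for 1{X<=q}
            q -= (c / t) * (b_debiased - tau)      # drive P(X <= q) toward tau
        return q

Only one privatized bit per user is transmitted and only the running estimate is stored, which is consistent with the fully online, low-storage operation described above.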

Invited Session IS101: Statistical Inference and Computation for Complex Data

Statistical Analysis of Averaged Weighted Gradient

Descent Algorithm for Decentralized Federated

Learning

Yue Chen

Capital University of Economics and Business

Abstract: In recent years, decentralized federated

learning has become increasingly important for training collaborative models without sharing sensitive

data. Although its numerical convergence theory and

communication efficiency have been well developed

in the literature, its statistical properties have received little attention, especially for unbalanced data,

where the amounts of data across different clients vary

greatly. To this end, in this paper we propose an innovative Averaged Weighted Gradient Descent algorithm,

AWGD, based on a circle-type network structure.

Theoretically, we start with a linear regression model,

and then find that a larger learning rate leads to faster

numerical convergence but worse statistical efficiency.

The resulting AWGD estimator is asymptotically efficient if the learning rate is appropriate and the data are distributed homogeneously, even when the data are unbalanced. These findings are further

extended to general models, general loss functions and

heterogeneous data. Numerically, studies on simulated

and real data demonstrate that, under the same convergence rate, the proposed AWGD estimator has

superior statistical efficiency compared to the existing

competitors. More importantly, our numerical experiment results show that even for unbalanced data, the

proposed AWGD estimators are statistically as efficient as the global ones, if the learning rate is sufficiently small.

Hypothesis Testing in High Dimensional Linear

Regression via Wild Bootstrapping

Wenjuan Hu

Capital University of Economics and Business

Abstract: In recent years, U-statistic-type tests have been proposed for testing linear hypotheses on regression parameters in high dimensional linear models. We investigate the distributional properties of the test statistic in a more general setting, under both the null and a local alternative hypothesis. Different from previous studies, we find that the test statistic's asymptotic distribution is given by the sum of a normal

random variable and a mixed chi-square random variable. Previous test theories based on asymptotic normality can be viewed as a special case of our more

general theory. We further proposed using wild bootstrap with U-centering for practical implementation of

the new test theory. Our new test is shown to more

accurately control type-I error rates under more general settings. Simulation and real data examples further demonstrate the merit of our tests.

Joint work with Nan Lin, Baoxue Zhang.
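The following sketch shows the generic wild-bootstrap recipe for a degenerate U-statistic, the device behind the proposed calibration; the kernel, residuals, and omission of the paper's U-centering step are all simplifications for illustration.

```python
import numpy as np

def wild_bootstrap_ustat(h, e, B=499, rng=None):
    """Wild bootstrap for T = sum_{i != j} e_i e_j h_{ij}: perturb with
    Rademacher multipliers w and recompute, mimicking the mixed
    chi-square component of the limiting distribution."""
    rng = rng or np.random.default_rng(0)
    H = h * np.outer(e, e)
    np.fill_diagonal(H, 0.0)                    # exclude i = j terms
    T = H.sum()
    Tb = np.array([(H * np.outer(w, w)).sum()
                   for w in rng.choice([-1.0, 1.0], size=(B, len(e)))])
    return T, (1 + np.sum(Tb >= T)) / (B + 1)   # bootstrap p-value

rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=(n, 3))
e = rng.normal(size=n)                          # residuals under the null
print(wild_bootstrap_ustat(x @ x.T, e)[1])      # p-value, ~uniform under H0
```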

Sequential Quantile Regression for Stream Data by

Least Squares

Ye Fan

Capital University of Economics and Business

Abstract: Massive stream data are common in modern economics applications, such as e-commerce and

finance. They cannot be permanently stored due to

storage limitations, and real-time analyses need to be

updated frequently as new data become available. In

this work, we develop a sequential algorithm, SQR, to

support efficient quantile regression (QR) analysis for

stream data. Due to the non-smoothness of the check

loss, popular gradient-based methods do not directly

apply. Our proposed algorithm, partly motivated by

the Bayesian QR, converts the non-smooth optimization into a least squares problem and is hence significantly faster than existing algorithms that all require

solving a linear programming problem in local processing. We further extend the SQR algorithm to

composite quantile regression (CQR), and prove that

the SQR estimator is unbiased, asymptotically normal

and enjoys a linear convergence rate under mild conditions. We also demonstrate the estimation and inferential performance of SQR through simulation

experiments and a real data example on a US used car

price data set.
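The key computational device, recasting the non-smooth check loss as (iterated) least squares, can be sketched offline as follows. The paper's SQR additionally carries summary statistics across batches for one-pass streaming updates, which this toy version omits.

```python
import numpy as np

def qr_via_least_squares(X, y, tau=0.5, iters=100, delta=1e-8):
    """Quantile regression by iteratively reweighted least squares:
    rho_tau(r) = w(r) * r^2 with w(r) = |tau - 1{r < 0}| / |r|, so each
    iteration solves a weighted least-squares problem."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(iters):
        r = y - X @ beta
        w = np.where(r >= 0, tau, 1 - tau) / np.maximum(np.abs(r), delta)
        Xw = X * w[:, None]
        beta = np.linalg.solve(X.T @ Xw, Xw.T @ y)  # weighted normal equations
    return beta

rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.standard_t(df=3, size=n)
print(qr_via_least_squares(X, y, tau=0.5))  # approx [1.0, 2.0]
```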

Summary Statistics-Based Association Test for

Identifying the Pleiotropic Effects with Set of Genetic Variants

Deliang Bu

Capital University of Economics and Business

Abstract: Traditional genome-wide association studies focus on testing one-to-one relationships between genetic variants and complex human diseases or traits. Despite their success in the past decade, this one-to-one

paradigm lacks efficiency because it does not utilize

the information of intrinsic genetic structure and pleiotropic effects. Due to privacy reasons, only summary

statistics of current genome-wide association study

data are publicly available. Existing summary statistics-based association tests do not consider covariates

for regression model, while adjusting for covariates

including population stratification factors is a routine

issue.

In this work, we first derive the correlation coefficients between summary Wald statistics obtained from

linear regression model with covariates. Then, a new

test is proposed by integrating three-level information

including the intrinsic genetic structure, pleiotropy,

and the potential information combinations. Extensive

simulations demonstrate that the proposed test outperforms three other existing methods under most of

the considered scenarios. Real data analysis of polyunsaturated fatty acids further shows that the proposed

test can identify more genes than the compared existing methods.

Invited Session IS031: Machine Learning Methods:

Theory and Applications

Towards Non-Asymptotic Convergence for Diffusion-Based Generative Models

Gen Li

The Chinese University of Hong Kong

Abstract: Diffusion models, which convert noise into

new data instances by learning to reverse a Markov

diffusion process, have become a cornerstone in contemporary generative modeling. While their practical

power has now been widely recognized, the theoretical underpinnings remain far from mature. In this

work, we develop a suite of non-asymptotic theory

towards understanding the data generation process of

diffusion models in discrete time, assuming access to

ℓ2-accurate estimates of the (Stein) score functions.

For a popular deterministic sampler (based on the

probability flow ODE), we establish a convergence

rate proportional to 1/T (with T the total number of steps), improving upon past results; for another mainstream stochastic sampler (i.e., a type of denoising diffusion probabilistic model), we derive a convergence rate proportional to 1/√T, matching the

state-of-the-art theory. Imposing only minimal assumptions on the target data distribution (e.g., no

smoothness assumption is imposed), our results characterize how ℓ2 score estimation errors affect the

quality of the data generation process.

Dual-Directed Algorithm Design for Efficient Pure

Exploration

Wei You

The Hong Kong University of Science and Technology

Abstract: We consider pure-exploration problems in

the context of stochastic sequential adaptive experiments with a finite set of alternative options. The goal

of the decision-maker is to accurately answer a query

question regarding the alternatives with high confidence and minimal measurement effort. A typical

query question is to identify the alternative with the

best performance, leading to best-arm identification in

the machine learning literature. We focus on the

fixed-confidence setting and, by incorporating the dual

variables directly, we characterize the necessary and

sufficient conditions for an allocation to be optimal.

The use of dual variables allows us to bypass the combinatorial structure of the optimality conditions that

relies solely on primal variables. Remarkably, these

optimality conditions enable an extension of the top-two algorithm design principle (Russo, 2020), initially

proposed for best-arm identification. Furthermore, our


optimality conditions give rise to a straightforward yet

efficient selection rule, termed information-directed

selection, which adaptively picks from a candidate set

based on information gain of the candidates. We establish that, paired with information-directed selection,

top-two Thompson sampling is (asymptotically) optimal for Gaussian best-arm identification, solving a

glaring open problem in the pure exploration literature.

Moreover, our analysis also leads to a general principle to guide adaptations of Thompson sampling for

pure-exploration problems. Numerical experiments

highlight the exceptional efficiency of our proposed

algorithms relative to existing ones.

Joint work with Chao Qin.

Inference for High Dimensional Proportional Hazards Model with Streaming Survival Data

Haijin He

Shenzhen University

Abstract: We propose an online inference procedure

for high dimensional streaming survival data based on

the proportional hazards model. We offer an online

Lasso method for regression parameter estimation and

establish the non-asymptotic error bounds of the corresponding Lasso estimators for the regression parameter vector. In addition, we study the pointwise and

group inference for the regression parameters by utilizing a debiased Lasso method. Extensive simulations

are conducted to evaluate the finite sample performance of the proposed method. The results show good

performance of the proposed method. An application

to a colon cancer dataset is provided to demonstrate

the practical utility of the proposed methodology.

Screen Then Select: A Strategy for Correlated Predictors in High-Dimensional Quantile Regression

Xuejun Jiang

Southern University of Science and Technology

Abstract: Strong correlation among predictors and

heavy-tailed noises pose a great challenge in the analysis of ultra-high dimensional data. Such challenge

leads to an increase in the computation time for discovering active variables and a decrease in selection

accuracy. To address this issue, we propose an innovative two-stage screen-then-select approach and its

derivative procedure based on a robust quantile regression with sparsity assumption. This approach

initially screens important features by ranking quantile ridge estimates and subsequently employs a likelihood-based post-screening selection strategy to refine

variable selection. Additionally, we conduct an internal competition mechanism along the greedy search

path to enhance the robustness of the algorithm against design dependence. Our methods are simple to

implement and possess numerous desirable properties

from theoretical and computational standpoints. Theoretically, we establish the strong consistency of feature selection for the proposed methods under some

regularity conditions. In empirical studies, we assess

the finite sample performance of our methods by

comparing them with utility screening approaches and

existing penalized quantile regression methods. Furthermore, we apply our methods to identify genes

associated with anticancer drug sensitivities for practical guidance.

Joint work with Yakun Liang, Haofeng Wang.

Invited Session IS097: Statistical Learning for

Threshold Models and Applications

Lasso and Post-Lasso Inference for Multiple

Threshold Regressions with an Application to Return Predictability

Chenchen Ma

Peking University

Abstract: This paper considers a multiple threshold

regression model, where the coefficient parameters

can switch between regimes according to the value of

a threshold variable, and establishes valid inference for a Lasso-type shrinkage estimation procedure

that consistently estimates the multiple thresholds.

The procedure is robust to both a diverging number of

thresholds and shrinking threshold effects. Asymptotic

properties, including the consistency of the group

Lasso estimators and threshold number estimator, and

the limiting distributions of the threshold estimators and

the likelihood ratio statistic, are established under a

set of regularity conditions. The focus is further

placed on the new development of the post-Lasso


inferential theory, which accounts for the randomness

of threshold selection and is achieved by characterizing the distribution of the coefficient estimators conditional on the selected model. Monte Carlo simulations demonstrate that the estimators are well-behaved

in finite samples. An empirical application to return

prediction further illustrates the practical merits of our

methodology.

Joint work with Yundong Tu.

Multi-Threshold Regression with Endogeneity

Chuang Wan

Nankai University

Abstract: This article develops a comprehensive estimation and inference framework for multi-threshold

regression, employing instrumental variables. One

major challenge is determining the number of threshold points. We first propose a modified information

criterion and show its consistency under mild conditions. However, its practical utility is sometimes compromised due to its sensitivity to the choice of penalization magnitude, creating a gap between theory and

practice. To bridge this gap, we exploit a

cross-validation criterion alongside an order-preserved

sample-splitting strategy tailored specifically for

threshold regression. The new criterion is completely

data-driven and therefore more convenient for practical use. We then formulate hypotheses serving distinct

purposes: the presence of threshold effects and the

existence of endogeneity. In cases where regressors

and the threshold variable are both endogenous, the proposed approaches remain applicable with slight adjustments using the control function framework. Extensive simulation experiments validate the reliable

performance of our methodologies in finite sample

cases. We finally conduct an empirical application to

explore the 401(k) retirement plans dataset, from which some new findings are discovered.

Robust Estimation of Structural Instability in the

Large-Dimensional Factor Model

Wei Wang

Shandong University of Finance and Economics

Abstract: Numerous empirical studies in economics and finance have verified that the distributions of many economic variables exhibit heavy-tailedness and structural instability. In this paper, we consider

the estimation of structural instability in

a large-dimensional factor model with heavy-tailed

distributions. There exists an unknown structural

break in the factor loadings. We estimate the structural

break by minimizing a piecewise Huber principal component analysis (HPCA) criterion and, under relaxed conditions such as weaker higher-order moment requirements, show that the estimator is consistent for the break

date. Monte Carlo simulations are designed to compare the finite-sample performance with classical estimators based on principal component analysis. Last,

we also estimate the structural break in U.S. stock market and U.S. macroeconomic data, respectively.

Common Threshold Estimation in Large Heterogeneous Panels with a Multifactor Error Structure

Yimeng Xie

Xiamen University

Abstract: This paper studies large heterogeneous panel models with a common threshold effect and a multifactor error structure, where the threshold effect is allowed to influence both the coefficients of observed regressors and the loadings of latent factors. To estimate

the coefficients and the threshold parameter, we consider auxiliary regressions where cross sectional

averages of observed individual-specific covariates

are used to augment regressors, and propose a simple

concentrated least squares estimation procedure, in

which estimation of the factor number is not needed. It is shown that the estimator of the threshold parameter is super-

consistent and the convergence rate depends on dimensions of both time periods (T) and cross sectional

units (n), while the estimators of coefficients are

√T-consistent and asymptotically normal. In addition, we propose a test of linearity to examine the

existence of the common threshold. Monte Carlo simulations are provided and show that our proposed

estimators and test have satisfactory finite sample

performances. Finally, an empirical application about

asset pricing is presented to illustrate how to adjust portfolios via our proposed methods.

Joint work with Yanbo Liu, Rui Chen.

Invited Session IS075: Statistical Learning for

Large Foundation Models

Learning Prediction Function of Prior Measures

for Statistical Inverse Problems of Partial Differential Equations

Junxiong Jia

Xi'an Jiaotong University

Abstract: In this study, we formulate statistical inverse problems involving partial differential equations

(PDEs) as PDE-constrained regression problems, with

a focus on learning the predictive functions of prior

probability measures. Adopting this viewpoint, we

introduce general generalization bounds for infinite-dimensional prior measures within the framework

of probably approximately correct (PAC) learning theory. Our theoretical framework is meticulously constructed on infinite-dimensional separable Banach

spaces, closely linking it to conventional infinite-dimensional Bayesian inverse methods. Motivated by the notion of α-differential privacy, we advance

a more general condition that includes the standard

Gaussian measures prevalent in statistical inverse

problems. This condition permits the learned prior

measures to be contingent on the observed data.

Through a series of pivotal theoretical demonstrations,

we derive concrete generalization bounds suitable for

both linear and nonlinear inverse problems, in a form

that can integrate typical PDEs. Utilizing these derived bounds, we construct well-defined practical

algorithms in infinite-dimensional spaces. To illustrate

the potential applications of our proposed methodology, we present numerical examples that showcase its

effectiveness in learning the predictive functions of

prior probability measures.

Advancements in Understanding

Over-Parameterized Deep Equilibrium Models:

Bridging Theory and Practice

Zenan Ling

Huazhong University of Science and Technology

Abstract: A deep equilibrium model (DEQ) is implicitly defined through an equilibrium point of an infinite-depth weight-tied model with an input-injection.

Instead of infinite computations, it solves an equilibrium point directly with root-finding and computes

gradients with implicit differentiation. As a typical

implicit neural network (NN), DEQ has recently

emerged as a new neural network design paradigm,

demonstrating remarkable success on various tasks.

Nevertheless, the theoretical understanding of DEQs

is still limited. In this talk, we will introduce several

recent advancements in the theoretical comprehension

of over-parameterized DEQs: (1) a novel

non-asymptotic framework to establish the global

convergence of the gradient descent (GD) associated

with an over-parameterized DEQ; (2) a novel asymptotic framework for establishing the equivalence between implicit DEQs and explicit NNs in high dimensions. These findings leverage recent advances in

high-dimensional analysis and random matrix theory.
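For readers new to DEQs, the forward pass below illustrates the defining fixed-point computation for a single weight-tied layer with input injection; the weight scaling is chosen to make the map a contraction, and the closing comment notes where implicit differentiation enters. This is a didactic sketch, not the models analyzed in the talk.

```python
import numpy as np

def deq_forward(W, U, x, tol=1e-10, max_iter=500):
    """Solve z* = tanh(W z* + U x) by fixed-point iteration, i.e. the
    limit of an infinite-depth weight-tied network with input injection."""
    z = np.zeros(W.shape[0])
    for _ in range(max_iter):
        z_new = np.tanh(W @ z + U @ x)
        if np.linalg.norm(z_new - z) < tol:
            break
        z = z_new
    return z

rng = np.random.default_rng(0)
d = 16
W = 0.3 * rng.normal(size=(d, d)) / np.sqrt(d)  # small norm => contraction
U = rng.normal(size=(d, d)) / np.sqrt(d)
x = rng.normal(size=d)
z_star = deq_forward(W, U, x)
# Training does not backpropagate through the iterations: by implicit
# differentiation, dz*/dtheta = (I - J)^{-1} d tanh(.)/dtheta, with J the
# Jacobian of the layer evaluated at z*.
print(np.linalg.norm(z_star - np.tanh(W @ z_star + U @ x)))  # ~0
```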

Simulation Learning Methodology: Theory, Algorithms, and Applications


Jun Shu

Xi'an Jiaotong University

Abstract: In recent years, one of the breakthroughs in artificial intelligence research has been the remarkable development of large models, exemplified by ChatGPT. Compared with traditional deep learning models, which are characterized by solving specific tasks, large models exhibit astonishing capabilities, known as emergent abilities, in complex tasks such as cross-task generalization. However, there is a gap between the brute-force, scale-driven paradigm through which large models achieve their success and the resource-constrained mode of academic research. Starting from the problem of reducing large models, this talk introduces a meta-learning framework based on the Simulation Learning Methodology (SLeM), elaborates the underlying statistical learning theory of task-transfer generalization, and, taking automated machine learning as a typical application scenario, presents a family of fundamental algorithms for machine learning automation, revealing the potential applicability of the SLeM learning paradigm to real-world scenarios.

Understanding and Improving LLM Training:

Insights into Adam and Advent of Adam-Mini

Ruoyu Sun

The Chinese University of Hong Kong (Shenzhen)

Abstract: Adam is the default algorithm for training

large foundation models. In this talk, we aim to understand why Adam is better than SGD on training

large foundation models, and propose a

memory-efficient alternative called Adam-mini. First,

we provide an explanation of the failure of SGD on

transformer: (i) Transformers are "heterogeneous": the Hessian spectrum varies dramatically across parameter blocks; (ii) heterogeneity hampers SGD: SGD performs badly on problems with block heterogeneity.

performs badly on problems with block heterogeneity.

Second, motivated by this finding, we introduce Adam-mini, which partitions the parameters according to

the Hessian structure and assigns a single second

moment term to all weights in a block. We empirically show that Adam-mini saves 45-50% memory

over Adam without compromising performance, on

various models including 7B-size language models

and ViT.
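The core modification can be sketched in a few lines: keep Adam's per-coordinate first moment, but maintain a single second-moment scalar per parameter block, with blocks chosen to respect the Hessian structure. A hypothetical numpy sketch, not the released Adam-mini implementation:

```python
import numpy as np

def adam_mini_step(blocks, grads, state, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One optimizer step with a single second-moment scalar per block."""
    state["t"] = state.get("t", 0) + 1
    t = state["t"]
    for name, theta in blocks.items():
        g = grads[name]
        m = state.setdefault("m_" + name, np.zeros_like(g))
        v = state.get("v_" + name, 0.0)                # one scalar per block
        m[:] = b1 * m + (1 - b1) * g                   # per-coordinate momentum
        v = b2 * v + (1 - b2) * float(np.mean(g * g))  # block-averaged g^2
        state["v_" + name] = v
        m_hat, v_hat = m / (1 - b1 ** t), v / (1 - b2 ** t)
        theta -= lr * m_hat / (np.sqrt(v_hat) + eps)   # in-place update

# Blocks would follow the transformer's parameter partition (queries,
# keys, MLP layers, ...); two toy blocks suffice to show the mechanics.
blocks = {"attn.q": np.zeros(4), "mlp.w1": np.zeros(3)}
grads = {k: np.full_like(v, 0.1) for k, v in blocks.items()}
state = {}
adam_mini_step(blocks, grads, state)
print(blocks["attn.q"])  # every coordinate moved by roughly -lr
```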

Invited Session IS080: Theoretical Foundations for

Machine Learning

Enhanced Topic Modeling using Entry-Wise Eigenvector Analysis

Tracy Ke

Harvard University

Abstract: Topic modeling is a widely used tool in text

analysis, aiming to extract meaningful 'topics' from a

collection of documents. This paper investigates the

optimal statistical guarantees for estimating a topic

model. The authors introduce a new normalized word

count matrix and provide a sharp entry-wise eigenvector analysis for this matrix. These results are used

to enhance an existing spectral algorithm, Topic-SCORE, for topic modeling. The authors demonstrate that the error rate of the enhanced algorithm is

minimax optimal across the entire parameter regime

of interest. Compared to existing results, the improvement is particularly significant in the challenging regime where all documents are short. Joint work

with Jingming Wang.

Network Tight Community Detection

Huimin Cheng

Boston University

Abstract: Conventional community detection methods often categorize all nodes into clusters. However,

the presumed community structure of interest may

only be valid for a subset of nodes (named "tight nodes"), while the rest of the network may consist of noninformative "scattered nodes". For example, a

protein-protein network often contains proteins that do

not belong to specific biological functional modules

but are involved in more general processes, or act as

bridges between different functional modules. Forcing

each of these proteins into a single cluster introduces

unwanted biases and obscures the underlying biological implication. To address this issue, we propose a

tight community detection (TCD) method to identify

tight communities excluding scattered nodes. The

algorithm enjoys a strong theoretical guarantee of

tight node identification accuracy and is scalable for

large networks. The superiority of the proposed

method is demonstrated by various synthetic and real

experiments.

Joint work with Jiayi Deng, Xiaodong Yang, Jun Yu,

Jun Liu, Zhaiming Shen, Danyang Huang.

Is Your Data Alignable? A Geometric Approach to

Single-Cell Data Integration

Rong Ma

Harvard University

Abstract: Single-cell data integration can provide a

comprehensive molecular view of cells, and many

algorithms have been developed to remove unwanted

technical or biological variations and integrate heterogeneous single-cell datasets. Despite their wide usage, existing methods suffer from several fundamental

limitations. In particular, we lack a rigorous statistical

test for whether two high-dimensional single-cell

datasets are alignable (and therefore should even be

aligned). Moreover, popular methods can substantially

distort the data during alignment, making the aligned

data and downstream analysis difficult to interpret. To

overcome these limitations, we present a spectral

manifold alignment and inference (SMAI) framework,

which enables principled and interpretable alignability

testing and structure-preserving integration of single-cell data with the same type of features. SMAI

provides a statistical test to robustly assess the alignability between datasets to avoid misleading inference


and is justified by high-dimensional statistical theory.

On a diverse range of real and simulated benchmark

datasets, it outperforms commonly used alignment

methods. Moreover, we show that SMAI improves

various downstream analyses such as identification of

differentially expressed genes and imputation of single-cell spatial transcriptomics, providing further biological insights. SMAI’s interpretability also enables

quantification and a deeper understanding of the

sources of technical confounders in single-cell data.

Joint work with Eric Sun, David Donoho, James

Zou.

Invited Session IS015: Experimental Design and

Big Data Subsampling

Optimal Designs for Order-of-Addition Two-Level

Factorial Experiments

Fasheng Sun

Northeast Normal University

Abstract: A new type of experiment, called the order-of-addition factorial experiment, has recently

received considerable attention in medicine science

and bioengineering. These experiments aim to simultaneously optimize the order of addition and dose

levels of drug components. In the experimental design

literature, the idea of dual-orthogonal arrays (DOAs)

was recently introduced for such experiments. However, constructing flexible DOAs is a challenging task.

In this paper, we propose a novel theory-guided search

method that efficiently identifies DOAs of any size (if

present). We also provide an algebraic construction

that instantly leads to certain DOAs. Moreover, to

address the potential issue that DOAs ignore interaction effects, we propose to construct a new type of

optimal designs under the expanded compound model,

named the strong DOA (SDOA). We provide two

algebraic constructions of the SDOA. We establish

theoretical results on the optimality of both DOAs and

SDOAs. Simulation studies are performed to demonstrate the superiority of our proposed designs.

Joint work with Qiang Zhao, Qian Xiao, Abhyuday

Mandal.

An Improved K-farthest Neighbor Detection

Methodology for Covariate Shift

Yu Tang

Soochow University

Abstract: In supervised learning, there is often a discrepancy between the data distribution during training

(source distribution) and the data distribution when

the model is used for testing (target distribution). We

aim to achieve a sensitive response from the software

system with minimal testing samples, even in the

presence of subtle covariate shift. To address covariate

shift in multivariate two-sample testing, this paper

proposes a novel Half K-Farthest Neighbor

(Half-KFN) test that demonstrates superior sensitivity

compared to the traditional K-Nearest Neighbor (KNN) test when detecting distribution shifts with small sample sizes and small-magnitude shifts. The underlying idea comes from the fact that when only a small proportion of samples shift to another distribution, the farthest neighbors, being far away, can better describe this shift. Furthermore, it effectively controls the Type I error rate under the null hypothesis. To evaluate our proposed methodology, numerical experiments are conducted on two open data

sets, namely MNIST and CIFAR-10. Various shift

forms are preset in the test data, with different sample

sizes and shift ratios. A comparative analysis of different multivariate two-sample testing methods is

performed. The results demonstrate that our proposed

Half-KFN algorithm consistently exhibits superior

sensitivity across various scenarios.

Joint work with Bingbing Wang.
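To convey the farthest-neighbor intuition, here is a simplified two-sample statistic with permutation calibration: count how often a point's k farthest neighbors carry the opposite sample label, which tends to be elevated under a shift. This generic sketch only loosely follows Half-KFN, which uses a different statistic and an analytic null rather than permutation.

```python
import numpy as np

def kfn_cross_rate(Z, labels, k=5):
    """Fraction of (point, farthest-neighbor) pairs with differing labels."""
    D = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
    np.fill_diagonal(D, -np.inf)                 # a point is not its own neighbor
    far = np.argsort(D, axis=1)[:, -k:]          # indices of k farthest points
    return np.mean(labels[far] != labels[:, None])

def kfn_permutation_test(X, Y, k=5, B=200, rng=None):
    """Exact-level permutation calibration of the cross-label rate."""
    rng = rng or np.random.default_rng(0)
    Z = np.vstack([X, Y])
    labels = np.array([0] * len(X) + [1] * len(Y))
    obs = kfn_cross_rate(Z, labels, k)
    null = np.array([kfn_cross_rate(Z, rng.permutation(labels), k)
                     for _ in range(B)])
    return obs, (1 + np.sum(null >= obs)) / (B + 1)

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
Y = rng.normal(size=(100, 2)) + np.array([0.5, 0.0])  # small covariate shift
print(kfn_permutation_test(X, Y))   # small p-value signals the shift
```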

Model-Free Subsampling Method for Massive Data

Based on Uniform Designs

Yongdao Zhou

Nankai University

Abstract: Subsampling or subdata selection is a useful approach in large-scale statistical learning. Most

existing studies focus on model-based subsampling

methods which significantly depend on the model

assumption. In this paper, we consider the model-free

subsampling strategy for generating subdata from the

original full data. In order to measure the goodness of

representation of a subdata with respect to the original


data, we propose a criterion, generalized empirical

F-discrepancy (GEFD), and study its theoretical

properties in connection with the classical generalized

ℓ2-discrepancy in the theory of uniform designs.

These properties allow us to develop a kind of

low-GEFD data-driven subsampling method based on

the existing uniform designs. By simulation examples

and a real case study, we show that the proposed subsampling method is superior to the random sampling

method. Moreover, our method remains robust under diverse model specifications while other popular model-based subsampling methods underperform. In practice, such a model-free property is more appealing than model-based subsampling, which may perform poorly when the model is misspecified, as demonstrated in our simulation studies. In addition, our

method is orders of magnitude faster than other model-free subsampling methods, which makes it more

applicable for subsampling of big data.

Joint work with Mei Zhang, Zheng Zhou, Aijun

Zhang.

Focus Subsampling: A More Efficient Subsampling

Method for Large-scale Linear Classification

Jun Yu

Beijing Institute of Technology

Abstract: Subsampling is one of the popular methods

to balance statistical efficiency and computational

efficiency in the big data era. Most of these methods aim at

selecting informative or representative sample points

to achieve good overall information of the full data.

Examples include OSMAC, IBOSS, OSS, leverage-score subsampling, and low-GEFD subsampling,

along with many suitable variations. The present talk

takes the view that, under a well-designed data partitioning, sampling techniques are best reserved for the region of interest, while summary measures suffice to collect the information from the rest. We

propose a focus subsampling strategy that combines

the summary measures and selected subdata points.

We will show that the proposed method leads to a

more efficient estimation for general large-scale linear

classification problems. We investigate and discuss

some properties of the method, establish some connections to the OSMAC subsampling method, and

illustrate its use via a real-world example.

Joint work with Haolin Chen.

Special Session SS2: Bernoulli Session on Stochastic Methods for Data Science

Multilevel Particle Filters for Partially Observed

McKean-Vlasov Stochastic Differential Equations

Ajay Jasra

The Chinese University of Hong Kong (Shenzhen)

Abstract: In this talk we consider the filtering problem associated with partially observed McKean-Vlasov

stochastic differential equations (SDEs). The model

consists of data that are observed at regular and discrete times and the objective is to compute the conditional expectation of (functionals) of the solutions of

the SDE at the current time. This problem, even in the ordinary SDE case, is challenging and requires numerical approximations. We develop a new particle filter

(PF) and multilevel particle filter (MLPF) to approximate the aforementioned expectations. We prove under assumptions that, for ε > 0, to obtain a mean square error of O(ε²) the PF has a cost per-observation time of O(ε⁻⁵) and the MLPF costs O(ε⁻⁴) (best case) or O(ε⁻⁵ log(ε)²) (worst case).

Our theoretical results are supported by numerical

experiments.

Bayesian Fixed-Domain Asymptotics for Covariance Parameters in Spatial Gaussian Process Models

Cheng Li

National University of Singapore

Abstract: Gaussian process models typically contain

finite dimensional parameters in the covariance function that need to be estimated from the data. We study

the Bayesian fixed-domain asymptotics for the covariance parameters in spatial Gaussian process regression models with an isotropic Matern covariance

function, which has many applications in spatial statistics. For the model without nugget, we show that

when the dimension of the domain is less than or

equal to three, the microergodic parameter and the


range parameter are asymptotically independent in the

posterior. While the posterior of the microergodic

parameter is asymptotically close in total variation

distance to a normal distribution with shrinking variance, the posterior distribution of the range parameter

does not converge to any point mass distribution in

general. For the model with nugget, we derive a new

evidence lower bound and consistent higher-order

quadratic variation estimators, which lead to explicit

posterior contraction rates for both the microergodic

parameter and the nugget parameter. We further study

the asymptotic efficiency and convergence rates of

Bayesian kriging prediction. All the new theoretical

results are verified in numerical experiments and real

data analysis.

Bootstrap-Assisted Inference for Weakly Stationary Time Series

Yunyi Zhang

The Chinese University of Hong Kong

Abstract: The literature often adopts two types of

stationarity assumptions in the analysis of time series,

i.e., weak stationarity, meaning that the mean

and the autocovariance function of a time series are

time invariant; and strict stationarity, indicating that

the marginal distributions of the time series are time

invariant. While the strict stationarity assumption is

vital from a theoretical standpoint, it is hard to verify in

practice. On the other hand, the weak stationarity is

relatively feasible to ensure and verify, as it only relies

on the second-order structures of the time series.

Hence, while various weak stationarity assumptions are typically adopted in time series modeling, statisticians may want to avoid relying on strict

stationarity assumptions during statistical inference.

This presentation focuses on the analysis of

quadratic forms within a weakly, but not necessarily

strictly stationary (vector) time series. In the context

of scalar time series, it establishes the Gaussian approximation for quadratic forms of a short-range dependent weakly stationary scalar time series. Building

upon this result, it derives the asymptotic distributions

of the sample autocovariances, the sample autocorrelations, and the sample autoregre- ssive coefficients.

Transitioning to vector time series, this presentation

tackles statistical inference within high-dimensional

vector autoregressive models with white noise innovations. Given the complicated covariance structures

inherent in non-stationary time series, this presentation adopts the dependent wild bootstrap method to

facilitate statistical inference. Numerical results verify the consistency of the proposed theories and

methods.

Strict stationarity is hard to ensure and verify for

a real-life dataset. Therefore, our work should be able

to assist statisticians in capturing the inherent

non-stationarity of real-life time series.

Joint work with Efstathios Paparoditis and Dimitris

N. Politis.

Invited Session IS090: Intersection Research of

Statistics and Computer Science (统计学与计算机

科学的交叉研究)

Optimal One-Pass Nonparametric Estimation under Memory Constraint

Zhenhua Lin

National University of Singapore

Abstract: For nonparametric regression in the

streaming setting, where data constantly flow in and

require real-time analysis, a main challenge is that

data are cleared from the computer system once processed due to limited computer memory and storage.

We tackle the challenge by proposing a novel

one-pass estimator based on penalized orthogonal

basis expansions and developing a general framework

to study the interplay between statistical efficiency

and memory consumption of estimators. We show that

the proposed estimator is statistically optimal under

memory constraint, and has asymptotically minimal

memory footprints among all one-pass estimators of

the same estimation quality. Numerical studies

demonstrate that the proposed one-pass estimator is

nearly as efficient as its nonstreaming counterpart that

has access to all historical data.

Joint work with Mingxue Quan.
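A stripped-down version of a one-pass basis-expansion estimator shows why memory stays bounded: only a K x K Gram matrix and a K-vector are retained, and each observation is discarded after updating them. The fixed ridge penalty below is a stand-in for the paper's penalization and tuning.

```python
import numpy as np

def cosine_basis(x, K):
    """Constant plus the first K cosine basis functions on [0, 1]."""
    k = np.arange(1, K + 1)
    return np.concatenate(([1.0], np.sqrt(2.0) * np.cos(np.pi * k * x)))

class OnePassSmoother:
    def __init__(self, K=15, lam=1e-4):
        self.K, self.lam = K, lam
        self.G = np.zeros((K + 1, K + 1))  # accumulated Gram matrix
        self.b = np.zeros(K + 1)           # accumulated basis-weighted responses

    def update(self, x, y):                # one pass: observe, update, discard
        phi = cosine_basis(x, self.K)
        self.G += np.outer(phi, phi)
        self.b += y * phi

    def predict(self, x):
        coef = np.linalg.solve(self.G + self.lam * np.eye(self.K + 1), self.b)
        return cosine_basis(x, self.K) @ coef

rng = np.random.default_rng(0)
sm = OnePassSmoother()
for _ in range(50_000):
    x = rng.random()
    sm.update(x, np.sin(2 * np.pi * x) + 0.1 * rng.normal())
print(sm.predict(0.25))  # close to sin(pi/2) = 1
```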

Evaluating Dynamic Conditional Quantile Treatment Effects with Applications in Ridesharing


Ting Li

Shanghai University of Finance and Economics

Abstract: Many modern tech companies, such as

Google, Uber, and Didi, utilize online experiments

(also known as A/B testing) to evaluate new policies

against existing ones. While most studies concentrate

on average treatment effects, situations with skewed

and heavy-tailed outcome distributions may benefit

from alternative criteria, such as quantiles. However,

assessing dynamic quantile treatment effects (QTE)

remains a challenge, particularly when dealing with

data from ride-sourcing platforms that involve sequential decision-making across time and space. In

this paper, we establish a formal framework to calculate QTE conditional on characteristics independent of

the treatment. Under specific model assumptions, we

demonstrate that the dynamic conditional QTE

(CQTE) equals the sum of individual CQTEs across

time, even though the conditional quantile of cumulative rewards may not necessarily equate to the sum of

conditional quantiles of individual rewards. This crucial insight significantly streamlines the estimation

and inference processes for our target causal estimand.

We then introduce two varying coefficient decision

process (VCDP) models and devise an innovative

method to test the dynamic CQTE. Moreover, we

expand our approach to accommodate data from spatiotemporal dependent experiments and examine both

conditional quantile direct and indirect effects. To

showcase the practical utility of our method, we apply

it to three real-world datasets from a ride-sourcing

platform. Theoretical findings and comprehensive

simulation studies further substantiate our proposal.

Joint work with Chengchun Shi, Zhaohua Lu, Yi Li,

Hongtu Zhu.

Inverse Constrained Reinforcement Learning:

From Theory to Practice

Guiliang Liu

The Chinese University of Hong Kong (Shenzhen)

Abstract: In recent years, Reinforcement Learning

(RL) has achieved remarkable performance in some

tasks, receiving widespread attention from academia

and industry. However, successful applications of RL

in tasks that can bring significant value to society

(such as autonomous driving, medical diagnosis, and

robot control) are still relatively limited. The main

reason for this is the difficulty in ensuring the safety

of the control policies. To ensure the reliability of RL

algorithms in critical applications, agents must understand their constraints. However, in many real-world

tasks, considering that constraints can change over

time and scenarios, and are highly related to the inherent experience of human experts, the optimal constraints are often difficult to specify with prior

knowledge and formulas accurately. To address these

challenges, we propose Inverse Constrained Reinforcement Learning (ICRL), which aims to learn the

constraints followed by experts from their demonstration data. This helps determine the constraints in different scenarios, allowing the imitating agents to

achieve performance like human experts. Compared

to static, artificially designed constraints, constraints

learned through data-driven methods can generalize

more effectively across multiple environments, provide a more comprehensive explanation for expert

behavior, and promote the safety of downstream applications. In this report, I will discuss the latest progress in inverse constraint reinforcement learning

research, delve into theoretical algorithmic achievements and applications, and introduce how to combine

inverse constrained reinforcement learning with large

decision-making models and human feedback to obtain policies with better generalization under evolving

constraints.

Reinforcement Learning for Precision Medicine in

HIV

Yanxun Xu

Johns Hopkins University

Abstract: The use of antiretroviral therapy (ART) has

significantly reduced HIV-related mortality and morbidity, transforming HIV infection to a chronic disease

with the care now focusing on treatment adherence,

comorbidities including mental health, and other

long-term outcomes. Since combination ART with

three or more drugs of different mechanisms or

against different targets is recommended for all people


living with HIV (PWH) and they must continue on it

indefinitely once started, understanding the long-term

ART effects on health outcomes and personalizing

ART treatment based on individuals’ characteristics is

crucial for optimizing PWH’s health outcomes and

facilitating precision medicine in HIV. In this talk, I

will present reinforcement learning (RL) methods

designed to learn and understand the impact of ART

on the health outcomes of PWH, and explore the future of HIV care through innovative and individualized approaches.

Invited Session IS028: Interface Between Statistics

and Neuro and Cognitive Science

Impact of Zealots on Cooperation: A Study Based

on the Behavioral Experiments of One-Shot Prisoner’s Dilemma Games

Lei Shi

Yunnan University of Finance and Economics

Abstract: The emergence, maintenance, and evolution of cooperative behavior in selfish groups has long

been an important research question in the fields of

natural and social sciences. To investigate whether

zealots can enhance and stabilize cooperation levels in

social dilemma experiments, we utilize a mixed design by cruising human interact with machine (Zealots)

to study the conditions of promoting cooperation and

associated evolutionary mechanism. The between-subjects variable be the strength of the dilemma

in the prisoner's dilemma game, and the within-subjects variables involve incorporating zealots or

not. Participants were randomly assigned to the

one-shot and anonymous prisoner's dilemma game

with either the high dilemma strength or low dilemma

strength. For each game, participants will experience

three conditions: a control condition where they are

paired with other participants, a treatment condition

where they are paired with zealots but are unaware of

this information, and a treatment condition where they

are paired with zealots and are aware of this information. In order to counterbalance the order effects,

we employ a Latin Square method for our within-subjects experiments. We found that zealots are

indeed able to increase cooperation among players; we also found evidence that a minimum number of zealots is needed to promote cooperation; and finally, we found a bottleneck effect in enhancing cooperation.

Lifespan Connectome Growth Modeling

Yong He

Beijing Normal University

Abstract: The emergence, development, and aging of

the connectome architecture enable the dynamic reorganization of network specialization and integration

throughout the lifespan, contributing to continuous

changes in human cognition and behavior. Understanding the spatiotemporal growth process of the

typical connectome is critical for elucidating network-level developmental principles in healthy individuals and for pinpointing periods of heightened

vulnerability or potential. In this talk, I will present

our recent work on lifespan normative growth modeling of the human connectome derived from multimodal MRI data from 33,250 individuals aged 32

postmenstrual weeks to 80 years from 132 global sites.

Furthermore, I will demonstrate how

these connectome-based normative models can be

employed to identify individual heterogeneities

in brain network phenotypes in patients with neurological or psychiatric disorders, including autism

spectrum disorder, major depressive disorder, and

Alzheimer's disease. It is anticipated that the connectome-based growth modeling will assist in elucidating the lifespan evolution of the brain networks and

serve as a normative reference for quantifying individual variation in development, aging, and neuropsychiatric disorders.

Learning Network-Structured Dependence from

Non-Stationary Multivariate Point Process Data

Chunming Zhang

University of Wisconsin-Madison

Abstract: Understanding sparse network dependencies among nodes from multivariate point process

data has broad applications in information transmission, social science, and computational neuroscience.

This paper introduces new continuous-time stochastic

models for conditional intensity processes, revealing

network structures within non-stationary multivariate

counting processes. Our model's stochastic mechanism is crucial for inferring graph parameters relevant

to structure recovery, distinct from commonly used

processes like the Poisson, Hawkes, queuing, and

piecewise deterministic Markov processes. This leads

to proposing a novel marked point process for intensity discontinuities. We derive concise representations

of their conditional distributions and demonstrate

cyclicity of the counting processes driven by recurrence time points. These theoretical properties enable

us to establish statistical consistency and convergence

properties for proposed penalized M-estimators in

graph parameters under mild regularity conditions.

Simulation evaluations showcase the method's computational simplicity and improved estimation accuracy compared to existing approaches. Real neuron

spike train recordings are analyzed to infer connectivity in neuronal networks.

Debiased Estimation and Inference for Spatial-Temporal EEG/MEG Source Imaging

Peifeng Tong

Peking University

Abstract: The development of accurate electroencephalography (EEG) and magnetoencephalography

(MEG) source imaging algorithms is of great importance for functional brain research and

non-invasive presurgical evaluation of epilepsy. In

practice, the challenge arises from the fact that the

number of measurement channels is far less than the

number of candidate source locations, rendering the

inverse problem ill-posed. A widely used approach is

to introduce a regularization term into the objective

function, which inevitably biases the estimated amplitudes towards zero, leading to inaccurate estimation of the estimator's variance. This study proposes a

novel debiased EEG/MEG source imaging (DeESI)

algorithm for detecting sparse brain activities, which

corrects the estimation bias in signal amplitude, dipole

orientation and depth. The DeESI extends the idea of

group Lasso by incorporating both the matrix Frobenius norm and the L1-norm, which guarantees the

estimators are sparse only over sources while maintaining smoothness in time and orientation. We also

derive the variance of the debiased estimators for standardization and hypothesis testing. A fast alternating

direction method of multipliers (ADMM) algorithm is

proposed for solving the matrix form optimization

problem directly without the need for vectorization.

The proposed algorithm is compared with nine existing ESI methods using simulations and an open source

EEG dataset whose stimulation locations are known

precisely. The DeESI exhibits the best performance in

peak localization and amplitude reconstruction.

Joint work with Haoran Yang, Xinru Ding, Yuchuan

Ding, Xiaokun Geng, Shan An, Guoxin Wang, Song

Xi Chen.

Contributed Session CS003: Recent Advances in

Mixture Model

A Gaussian Mixture Model for Multiple Instance

Learning with Partially Subsampled Instances

Baichen Yu

Peking University

Abstract: Multiple instance learning is a powerful

machine learning technique, which is found useful

when numerous instances can be naturally grouped

into different bags. Accordingly, a bag-level label can

be created for each bag according to whether the instances contained in the bag are all negative or not.

Thereafter, how to train a statistical model with

bag-level labels, with or without partially labeled instances, becomes a problem of great interest. To this

end, we develop a Gaussian mixture model (GMM)

framework to describe the stochastic behavior of the

instance-level feature vectors. Both the instance-based

maximum likelihood estimator (IMLE) and the

bag-based maximum likelihood estimator (BMLE) are

theoretically investigated. We found that the statistical

efficiency of the IMLE could be much better than that

of the BMLE if the instance-level labels are relatively hard to predict. To address this problem, we develop

here a subsampling-based maximum likelihood estimation (SMLE) approach, where the instance-level

labels are partially provided through careful subsampling. This leads to a significantly reduced labeling cost with little sacrifice in statistical efficiency. To demonstrate the finite sample performance,

extensive simulation studies are presented. A real data

example using whole-slide images (WSIs) to diagnose

metastatic breast cancer is illustrated.

Joint work with Xuetong Li, Jing Zhou, Hansheng

Wang.

Semi-Implicit Variational Inference via Score

Matching

Longlin Yu

Peking University

Abstract: Semi-implicit variational inference (SIVI)

greatly enriches the expressiveness of variational families by considering implicit variational distributions

defined in a hierarchical manner. However, due to the

intractable densities of variational distributions, current SIVI approaches often use surrogate evidence

lower bounds (ELBOs) or employ expensive inner-loop MCMC runs for direct ELBO maximization

for training. In this paper, we propose SIVI-SM, a

new method for SIVI based on an alternative training

objective via score matching. Leveraging the hierarchical structure of semi-implicit variational families,

the score matching objective allows a minimax formulation where the intractable variational densities

can be naturally handled with denoising score matching. We show that SIVI-SM closely matches the accuracy of MCMC and outperforms ELBO-based SIVI

methods in a variety of Bayesian inference tasks.

Joint work with Cheng Zhang.

Estimating IRT Models under Gaussian Mixture

Modelling of Latent Traits: An Application of

MSAEM Algorithm

Siyao Cheng

Northeast Normal University

Abstract: The assumption of a normal distribution for

latent traits is a common practice in item response

theory (IRT) models. Numerous studies have demonstrated that this assumption is often inadequate, impacting the accuracy of statistical inferences in IRT

models. To mitigate this issue, Gaussian mixture

modeling (GMM) for latent traits, known as

GMM-IRT, has been proposed. Moreover, the

GMM-IRT models can also serve as powerful tools

for exploring the heterogeneity of latent traits. However, the computation of GMM-IRT model estimation

encounters several challenges, impeding its widespread application. The purpose of this paper is to

propose a reliable and robust computing method for

GMM-IRT model estimation. Specifically, we develop

a mixed stochastic approximation EM (MSAEM)

algorithm for estimating the three-parameter normal

ogive model with GMM for latent traits

(GMM-3PNO). Crucially, the GMM-3PNO is augmented to be a complete data model within the exponential family, thereby substantially streamlining the

computation of the MSAEM algorithm. Furthermore,

the MSAEM algorithm adeptly avoids the label-switching issue, ensuring its convergence. Finally,

simulation and empirical studies are conducted to

validate the performance of the MSAEM algorithm

and demonstrate the superiority of the GMM-IRT

models.

Joint work with Xiangbin Meng.

Gaussian Mixture Model with Rare Events

Xuetong Li

Peking University

Abstract: We study here a Gaussian Mixture Model

(GMM) with rare events data. In this case, the commonly used Expectation-Maximization (EM) algorithm exhibits an extremely slow numerical convergence

rate. To theoretically understand this phenomenon, we

formulate the numerical convergence problem of the

EM algorithm with rare events data as a problem

about a contraction operator. Theoretical analysis

reveals that the spectral radius of the contraction operator in this case could be arbitrarily close to 1 asymptotically. This theoretical finding explains the

empirical slow numerical convergence of the EM

algorithm with rare events data. To overcome this

challenge, a Mixed EM (MEM) algorithm is developed, which utilizes the information provided by partially labeled data. We find that MEM algorithm significantly improves the numerical convergence rate as

compared with the standard EM algorithm. The finite


sample performance of the proposed method is illustrated by both simulation studies and a real-world

dataset of Swedish traffic signs.

Joint work with Jing Zhou, Hansheng Wang.
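The phenomenon is easy to reproduce with textbook EM for a two-component GMM: when the mixing weight of one component is tiny, the EM map's contraction factor is close to 1 and the iterates crawl. The sketch below implements only the standard EM updates; the paper's MEM remedy additionally exploits partially labeled observations.

```python
import numpy as np

def em_gmm_1d(x, pi0, mu0, mu1, sigma=1.0, iters=200):
    """Standard EM for a two-component GMM with known common variance."""
    pi, m0, m1 = pi0, mu0, mu1
    for _ in range(iters):
        # E-step: posterior probability of the (rare) first component
        d0 = pi * np.exp(-0.5 * ((x - m0) / sigma) ** 2)
        d1 = (1 - pi) * np.exp(-0.5 * ((x - m1) / sigma) ** 2)
        w = d0 / (d0 + d1)
        # M-step: update mixing weight and component means
        pi = w.mean()
        m0 = (w * x).sum() / w.sum()
        m1 = ((1 - w) * x).sum() / (1 - w).sum()
    return pi, m0, m1

rng = np.random.default_rng(0)
n, pi_true = 50_000, 0.01                 # rare events: a 1% component
lab = rng.random(n) < pi_true
x = np.where(lab, rng.normal(-2.0, 1.0, n), rng.normal(1.0, 1.0, n))
# Progress per iteration is visibly slow for tiny pi_true, consistent
# with the spectral-radius-near-one analysis described in the abstract.
print(em_gmm_1d(x, pi0=0.05, mu0=-1.0, mu1=0.5))
```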

Mixed Models for Longitudinal Binary Outcomes

with Crossed Random Effects

Shi Zhang

University of Manitoba

Abstract: Longitudinal studies are important for understanding patterns of change and the effectiveness

of interventions. These studies sometimes involve

data collected from different levels, such as firms

being analyzed by multiple analysts and analysts

providing forecasts for multiple firms. This reflects

the interaction of factors across different levels. It is

crucial to use accurate statistical methods to analyze

such complex nested longitudinal data to ensure the

scientific validity of study results and conclusions.

This proposal aims to develop new statistical techniques specifically designed for analyzing longitudinal binary outcomes with crossed nested structures.

An appropriate analysis of this type of data should

consider random effects at all levels. In this thesis, we

include partially crossed random effects in mixed

models for longitudinal binary outcomes. We predict

the random effects using the orthodox best linear unbiased predictor method and obtain consistent estimators for the regression parameters. This method relies

only on the first and second moments of the random

effects, making it robust against distributional assumptions. We demonstrate the usefulness of our approach through simulation and application to US firms

to investigate factors from analysts and firms that are

linked to long-term growth forecasts for firms.

Joint work with Depeng Jiang.

Contributed Session CS004: Statistical Hypothesis

Testing in Complex Data

Hypothesis Testing in Gaussian Graphical Models:

Goodness-of-Fit and Conditional Randomization

Tests

Xiaotong Lin

National University of Singapore

Abstract: We introduce novel hypothesis testing

methods for Gaussian graphical models by generating

exchangeable copies. We utilize the copies to formulate a goodness-of-fit test, which is valid in both low

and high-dimensional settings and flexible in choosing

the test statistic. This test exhibits superior power

performance, especially in scenarios where the true

precision matrix violates the null hypothesis with

many small entries. Furthermore, we adapt the sampling algorithm for constructing a new conditional

randomization test for the conditional independence

between a response Y and a vector of covariates X

given some other variables Z without requiring

any modeling assumption about Y. It also relaxes the

assumptions of conditional randomization tests by

allowing the number of unknown parameters of the

distribution of X to be much larger than the sample

size. For both of our testing procedures, we propose

several test statistics and conduct comprehensive simulation studies to demonstrate their superior performance in controlling the Type-I error and achieving

high power. The usefulness of our methods is further

demonstrated through real-world applications.

Joint work with Dongming Huang, Fangqiao Tian.

Robust Estimation and Testing for GARCH Models via Exponentially Tilted Empirical Likelihood

Yashuang Li

Yunnan University

Abstract: The GARCH model has become one of the

most powerful and widespread tools for dealing with

time series heteroskedastic models. A commonly employed approach for inference on GARCH models is

via the quasi-maximum likelihood. However, unless

the data are sampled regularly, the quasi-maximum

likelihood estimator is inconsistent due to density

misspecification or the presence of outliers. The main

aim of this paper is to present a robust nonparametric

likelihood analysis of GARCH models including estimation of the coefficient parameters as well as model specification testing of the GARCH process. A set

of identifying moment functions are specified by applying the idea of quantile regression models to the

GARCH process. Our moment restrictions not only allow the GARCH innovations to follow a general distribution but are also less sensitive to outliers. We then explore the use of exponentially tilted empirical likelihood (ETEL) to effectively combine these quantile-related moment restrictions. The ETEL framework

allows for imposing over-identifying restrictions and

offers implied probabilities for efficient and robust

moment estimation and inference. Asymptotic properties of the resultant ETEL estimators and test statistics

are investigated under mild conditions on the innovation distributions. We illustrate and evaluate the proposed strategies through numerical experiments on

simulated and real datasets.

Joint work with Puying Zhao, Niansheng Tang.

A Bayesian Phase I/II Platform Design for Multiple

Indications with Mixed Types of Endpoints of Toxicity and Efficacy

Xian Shi

East China Normal University

Abstract: For a new targeted or immunotherapy agent,

studying phase I/II behavior by combining multiple

indications with cancer-specific standard of care has become a new direction. In this article, we propose a Bayesian phase I/II platform design to co-develop

combination therapies in multiple indications with

a binary toxicity endpoint and a survival efficacy endpoint under a generic master protocol for the evaluation

of each indication. Bayesian hierarchical models are

used to borrow information across indications for

more efficient indication-specific decision-making.

A sequential design for optimal biological dose finding based on a utility function is provided. Simulation studies show that the proposed design has desirable operating characteristics and is superior to a design that uses a dichotomized efficacy endpoint.

Elevating Federated Clustering: Deep Generative

Models and Contrastive Learning Strategies

Jie Yan

Central University of Finance and Economics

Abstract: Federated clustering (FC) is an essential

extension of centralized clustering designed for the

federated setting, wherein the challenge lies in constructing a global similarity measure without sharing

private data. Conventional approaches to FC typically

adopt extensions of centralized methods, like

K-means and fuzzy c-means. However, these methods

are susceptible to non-independent-and-identically-distributed (non-IID) data among clients, leading to

suboptimal performance, particularly with

high-dimensional data. To handle these issues, we first bridged FC and deep generative models. By executing

an autoencoder-based deep clustering method on the

generated data, one can make the model immune to

the non-IID problem and significantly enhance its

performance, especially when dealing with

high-dimensional data. Nevertheless, this bridge still

manifests a substantial gap in clustering quality compared to state-of-the-art centralized clustering methods due to the inadequate representation learning capacity of autoencoder, and the generated data could

elevate the risk of privacy breaches. Then, we bridged

FC and contrastive models using the classic FedAvg

framework. By learning more clustering-friendly representations, the gap has been notably reduced in certain federated scenarios. However, our empirical and

theoretical analyses indicate that increased non-IID

level often accompanies increased correlations across

multiple dimensions of the learned representations,

leading to poor and non-robust performance. To address

this, we introduce a decorrelation regularizer, which

effectively mitigates the detrimental effects of the

non-IID problem, and achieves superior performance,

as evidenced by a marked increase in NMI scores,

with the gain reaching as high as 0.32 in the most

pronounced case. Moreover, these methodologies also

show superior performance in handling device failures

from a practical viewpoint.

Joint work with Jing Liu, Ji Qi, Yizi Ning, Zhongyuan Zhang.
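The decorrelation regularizer can be illustrated with a short sketch: penalize the squared off-diagonal entries of the empirical correlation matrix of the learned representations, so the penalty vanishes exactly when dimensions are uncorrelated. The exact regularizer used in the paper may differ; this is only a plausible instance.

```python
import numpy as np

def decorrelation_penalty(H):
    """Penalty discouraging correlation across representation dimensions.

    H: (n_samples, d) matrix of learned representations. Each dimension
    is standardized, and the squared off-diagonal entries of the
    resulting correlation matrix are summed.
    """
    Hs = (H - H.mean(axis=0)) / (H.std(axis=0) + 1e-8)
    C = (Hs.T @ Hs) / H.shape[0]            # empirical correlation matrix
    off_diag = C - np.diag(np.diag(C))
    return np.sum(off_diag ** 2)

# In training, this would be added to the clustering loss with a tuning
# weight, e.g. loss = clustering_loss + lam * decorrelation_penalty(H).
```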

The Lose-Lose or All-Lose Consequences: Assessing the International Economic Impact of Sino–U.S. Technological Decoupling

Yutao Jiang

Anhui University of Finance and Economics

Abstract: In response to the U.S. chip embargo, China has proposed export controls on crucial materials

like gallium, germanium, and graphite. However, few

studies have explored the economic impacts of these

trade sanctions policies. This study addresses this gap

by examining theoretical mechanisms and constructing a global input-output database for the chip, gallium-germanium, and graphite sectors. Using a dynamic computable general equilibrium model, we quantitatively evaluate the economic impacts of the Sino–

U.S. technological competition and conduct robustness tests. The results show that in the most extreme

scenario of the chip embargo, the GDP of China, U.S.,

and the world decreases by 1.051%, 0.006%, and

0.201%, respectively; that of Japan, South Korea, and

Taiwan, which follow the U.S. in implementing chip

sanctions, decreases by 0.109%, 0.177%, and 0.330%,

respectively. China’s export controls on crucial raw

materials will reduce national economic damage and

have a large negative impact on Japan, South Korea,

and Taiwan, whereas the U.S. suffers relatively limited negative impacts. Our findings reveal that the

Sino–U.S. technological competition is unfavorable to

the economic interests of the two countries and poses

challenges to global economic recovery in the

post-pandemic era.

Joint work with Lianbiao Cui.

Contributed Session CS001: Recent Advances in

Reinforcement Learning

Strategy Evaluation in Non-Stationary Reinforcement Learning Environment

非平稳强化学习环境下的策略评估

Wei Wang

Shandong University

Abstract: Reinforcement learning is one of the frontier topics in machine learning and has been widely applied in fields such as healthcare and economics. The classical reinforcement learning framework typically assumes a stationary decision environment, or one whose distribution does not change over time. This assumption fails in real-world environments driven by non-stationary unit-root processes, where classical reinforcement learning methods often yield suboptimal policies. To address this, we consider Markov decision processes in non-stationary unit-root environments and propose a model-free Q-learning algorithm and a model-based maximum likelihood estimation method. We analyze the consistency and efficiency of the proposed algorithm and estimator, verify their properties through simulations, and apply them to real problems.

Joint work with Xiaodong Yan.

Reinforcement Learning in Interval-Censored

Data

Zhimiao Cao

Shandong University

Abstract: Reinforcement learning is a general technique enabling an agent to learn an optimal policy and

interact with an environment in sequential decision-making problems. Interval-censored data is a

common form of data encountered in practical data

analysis, where observed results are only known to lie

within certain intervals rather than as exact values. The

characteristics of this data structure pose challenges to

traditional data analysis and decision-making, requiring appropriate strategies for handling and decision-making based on incomplete information. This

paper proposes a framework that applies reinforcement learning to interval-censored data processing to

develop an intelligent decision system capable of

offering personalized behavioral recommendations

based on observers' state and activity variables. By

combining reinforcement learning with interval-censored data, we can devise effective intervention strategies to address observers' emotional fluctuations and enhance the overall quality of their emotional states. Experimental results demonstrate that this

integrated approach effectively optimizes observers'

emotional states, providing a new method for personalized interventions and recommendations. This research is significant for the development of intelligent

and personalized emotion management systems, offering valuable insights for future health sciences and

intelligent decision systems.

Joint work with Xiaodong Yan, Chengchun Shi,

Jianqi Feng.

Unobserved Structural Changes in the Factor-Augmented Panel Quantile Model

Fanyu Meng

Shandong University

Abstract: Panel data models, known for their varied

structural patterns, have increasingly attracted attention in the fields of econometrics and statistics. This

paper addresses the issue of structural changes within

the factor-augmented panel quantile model. We introduce a unified method that detects the structural instabilities and sparsity in the model by employing a double penalized loss function. This approach guarantees

that the penalized estimators exhibit oracle properties,

providing adaptability without the necessity for stringent moment conditions on the errors. Our simulations

validate that the method is effective with finite samples, and its application to real-world data proves the

approach's capability to accurately detect structural

instability.

Joint work with Wei Wang, Xiaodong Yan, Xinbing

Kong.

Reinforcement Learning for Survival Analysis

Jianqi Feng

Shandong University

Abstract: Reinforcement learning aims to optimize

the mapping of states to actions in order to maximize

rewards. In survival analysis, determining the appropriate policy based on an individual's state is crucial.

To address this, we propose combining reinforcement

learning with survival analysis to achieve optimal

treatment outcomes. In order to accommodate the data

structure in survival analysis, we introduce a Markov

decision process that handles censored and recurrent

events. We propose a Q-function that utilizes reinforcement learning to determine the best treatment at

each step. To address the complexities associated with

multi-stage, censored and recurrent events, we redesign the data structure for single events and extend

finite data to infinite-stage data structures to accommodate reinforcement learning algorithms. By estimating the duration of recurrent events for all individuals at each stage, we maximize the probability of

exceeding a specified value as our target function in

reinforcement learning. Experimental results demonstrate that our proposed framework achieves and

maintains a high level of accuracy, even in the presence of right-censored data. This presents a novel

approach to decision-making in survival analysis.

Joint work with Wei Zhao, Chengchun Shi, Zhenke

Wu, Xiaodong Yan.

A Fast Optimal Hyperparameter Selection Based

on Bandits for Streaming Data

Zhang Yu

Shandong University

Abstract: In this paper, we propose an algorithm to

address the problem of efficiently selecting optimal

hyperparameters in continuous data streams. Specifically, we introduce an online algorithm, Online Hyperparameter Selection (OHS), based on the

multi-armed bandit framework. The implementation

of this algorithm requires only the availability of the

current data batch at each stage of the data stream,

without the need to observe the entire dataset. We

develop a dynamic procedure to select the optimal

hyperparameters at each arrival of a new data batch,

enabling adaptive adjustment of parameter values

along the data stream. Extensive numerical experiments are conducted to evaluate the performance of

OHS, covering a range of models including, but not

limited to, linear regression models, quantile regression models, and non-parametric models. These experiments demonstrate the effectiveness of our algorithm and are supported by theoretical results.

Joint work with Xiaodong Yan.
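To fix ideas, here is a minimal sketch of the bandit view of the problem: candidate hyperparameter values play the role of arms, and each incoming batch yields a reward for the arm pulled. The epsilon-greedy rule below is illustrative and is not the authors' OHS procedure.

```python
import numpy as np

def streaming_hyperparam_bandit(batches, candidates, fit_score, eps=0.1, rng=None):
    """Epsilon-greedy hyperparameter selection over a data stream.

    batches    : iterable of (X, y) batches arriving over time
    candidates : list of candidate hyperparameter values (the "arms")
    fit_score  : fit_score(X, y, h) -> reward, e.g. negative validation
                 loss of a model fit with hyperparameter h on the batch
    """
    rng = np.random.default_rng(rng)
    counts = np.zeros(len(candidates))
    rewards = np.zeros(len(candidates))
    choices = []
    for X, y in batches:
        if counts.min() == 0 or rng.random() < eps:
            a = int(rng.integers(len(candidates)))   # explore
        else:
            a = int(np.argmax(rewards / counts))     # exploit best mean reward
        r = fit_score(X, y, candidates[a])
        counts[a] += 1
        rewards[a] += r
        choices.append(candidates[a])
    return choices
```

Only the current batch is needed at each step, matching the streaming constraint described in the abstract.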

Contributed Session CS006: Interdisciplinary and

Applied Research: Statistical Analysis on Medical

Data and Models

Research on Convex Clustering for Multi-Source

Data

Jianxi Zhao

Beijing Information Science and Technology University

Abstract: In recent years, convex clustering has attracted intensive attention because it largely overcomes three shortcomings of traditional clustering methods: non-global convergence, poor robustness, and the need for prior information. Nowadays, the data for a large number of problems can be obtained from multiple sources. In this paper, I therefore propose a convex clustering model for multiple sources, establish a theoretical recovery guarantee, present a solving algorithm, and analyze the algorithm's convergence in theory. Numerical experiments on several multi-source datasets show that the proposed method achieves better clustering performance than some state-of-the-art clustering methods.

Generalization Analysis of Deep CNNs under

Maximum Correntropy Criterion

Zhiying Fang

Shenzhen Polytechnic University

Abstract: Convolutional neural networks (CNNs)

have gained immense popularity in recent years, finding their utility in diverse fields such as image recognition, natural language processing, and

bio-informatics. Despite the remarkable progress

made in deep learning theory, most studies on CNNs,

especially in regression tasks, tend to heavily rely on

the least squares loss function. However, there are

situations where such learning algorithms may not

suffice, particularly in the presence of heavy-tailed

noises or outliers. This predicament emphasizes the

necessity of exploring alternative loss functions that

can handle such scenarios more effectively, thereby

unleashing the true potential of CNNs. In this paper,

we investigate the generalization error of deep CNNs

with the rectified linear unit (ReLU) activation function for robust regression problems within an information theoretic learning framework. Our study

demonstrates that when the regression function exhibits an additive ridge structure and the noise possesses

a finite pth moment, the empirical risk minimization

scheme, generated by the maximum correntropy criterion and deep CNNs, achieves fast convergence rates.

Notably, these rates align with the minimax optimal

convergence rates attained by fully connected neural

network model with the Huber loss function up to a

logarithmic factor. Additionally, we further establish

the convergence rates of deep CNNs under the maximum correntropy criterion when the regression function resides in a Sobolev space on the sphere.

Construction and Validation of an Imaging Omics

Prediction Model for the Efficacy of Methylprednisolone in the Treatment of Radiation-Induced

Brain Injury

甲基强的松龙治疗放射性脑损伤疗效的影像组学

预测模型构建与验证

Xiaohuang Zhuo

Tianjin Huanhu Hospital

Abstract: Objective: Intravenous methylprednisolone is the primary treatment for radiation-induced brain injury after radiotherapy for nasopharyngeal carcinoma. However, some patients fail to benefit from methylprednisolone, and their condition may even worsen. The aim of this study is therefore to build a radiomics model to predict the efficacy of methylprednisolone in patients with radiation-induced brain injury. Subjects and methods: This study enrolled 66 patients with radiation-induced brain injury treated with methylprednisolone. All patients underwent cranial magnetic resonance imaging (MRI) before and after steroid therapy. From each patient's pre-treatment MRI images, 961 radiomic features were extracted. LASSO regression was then applied to select the features associated with steroid efficacy and to construct a radiomic signature classifier. Combining this signature with clinical predictors of steroid efficacy, a clinical-radiomics prediction model was built using multivariate logistic regression, and the model's discrimination, calibration, and clinical utility were evaluated. The model was internally validated using 10-fold cross-validation. Results: The radiomic signature classifier, composed of 16 selected features, achieved good predictive performance on the whole dataset and in different subgroups. The prediction model combining the radiomic signature with the time interval between radiotherapy and the diagnosis of radiation-induced brain injury showed good discrimination, with an AUC of 0.966 and a 10-fold cross-validation-corrected AUC of 0.967. The calibration curve also indicated good agreement. Decision curve analysis showed that the radiomics prediction model has practical clinical value. Conclusion: This study proposes a radiomics prediction model combining radiomic features with the time interval between radiotherapy and the diagnosis of radiation-induced brain injury, which can conveniently be used to predict in advance the efficacy of intravenous methylprednisolone in patients with radiation-induced brain injury.

A Copula-Based Approach on Optimal Allocation

of Hot Standbys in Series Systems

Jiandong Zhang

Northwest Normal University

Abstract: In this talk, we propose a copula-based

approach to study the allocation problem of hot

standbys in series systems composed of two heterogeneous and dependent components. By assuming that

the lifetimes of components and spares are dependent

and linked via a general survival copula, optimal allocation strategies are presented for the case of one

and two redundancies at the component level. Further,

redundancies allocation mechanisms are also compared between the allocations at the component level

and the system level. For the case of one hot standby,

we find that the performance of the redundant system

at the component level is always worse than that at the

system level. For the case of two hot standbys, the

reversed allocation principle (i.e., Barlow–Proschan

principle) is valid. Numerical examples and applications are also provided as illustrations. A real application on improving tensile strength of cables in high

voltage electricity transmission network systems is

presented for showing the applicability of our results.

Joint work with Yiying Zhang and Rongfang Yan.

A Monte Carlo Classification Method Based on

Partial Variables and Its Application in Clinical

Data

Shanjun Mao

Hunan University

Abstract: In some specific classification problems,

the input features X are divided into two parts: one

part is X1 that can be measured before constructing

the model, and the other part is X2 that cannot be

measured before constructing the model. In these

specific problems, the input features often cannot be

all measured before constructing the classification

model, and modeling using only the features that can

be observed will lose classification information. To

address this class of problems, this paper proposes a

Monte Carlo classification method based on partial

variables. The method models the sufficient dimension reduction of X2 at y = 0 and y = 1 by learning the

relationship between variable X1 and variable X2 in

the training set, and obtains the sufficient dimension

reduction result R(X2). Then, the Markov chain Monte Carlo method is used to build a sampling model

of R(X2), and finally a classification model is built

using X1 and R(X2) as input features. When new samples arrive, the intraoperative data are sampled after sufficient dimension reduction, so that both preoperative and intraoperative feature information can be taken into account at classification time, enabling the method to predict more accurately the category to which the samples belong.

Joint work with Yao Cui, LingYi Hu.
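As a heavily simplified sketch of the two-stage idea, the code below substitutes a linear-Gaussian model of X2 given X1 for the paper's sufficient dimension reduction R(X2) and its MCMC sampler; at prediction time the unobserved X2 is integrated out by Monte Carlo. All names are hypothetical and scikit-learn is assumed.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

def fit_partial_classifier(X1, X2, y):
    """Train-time: learn X2 | X1 and a classifier on (X1, X2).

    A linear-Gaussian model of X2 (n x p2, p2 >= 2) given X1 stands in
    for the paper's dimension reduction R(X2) and sampling model.
    """
    cond = LinearRegression().fit(X1, X2)
    resid = X2 - cond.predict(X1)
    cov = np.atleast_2d(np.cov(resid, rowvar=False)) + 1e-6 * np.eye(X2.shape[1])
    clf = LogisticRegression(max_iter=1000).fit(np.hstack([X1, X2]), y)
    return cond, cov, clf

def predict_with_sampling(X1_new, cond, cov, clf, n_draws=200, rng=None):
    """Test-time: X2 is unobserved, so average over Monte Carlo draws."""
    rng = np.random.default_rng(rng)
    mu = cond.predict(X1_new)
    probs = np.zeros((X1_new.shape[0], len(clf.classes_)))
    for _ in range(n_draws):
        noise = rng.multivariate_normal(np.zeros(mu.shape[1]), cov, size=len(mu))
        probs += clf.predict_proba(np.hstack([X1_new, mu + noise]))
    return probs / n_draws
```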

July 12, 16:00-17:40

Invited Session IS072: Statistical Interdisciplinary

Studies I

Solving Large-Scale Sparse Equations with Tree

Structures and Its Applications to Optical Fiber

Networks

Bingyi Jing

Southern University of Science and Technology

Abstract: In communication networks, detecting

asymmetric links is of significant practical importance

and has been a long-standing problem in industry.

From a statistical perspective, this problem can be

transformed into that of solving large-scale sparse

equations with tree structures. In this talk, we will

discuss how to embed sparsity into this problem and

provide effective and reliable solutions. Finally, we

will demonstrate the effectiveness of this approach in

finding asymmetric links in optical fiber networks. In

particular, we show that our proposed approach can

achieve an accuracy rate of 100% in realistic settings

and the method has already been deployed in industry.

Integrating Statistical Learning and Deep Learning for Efficient and Interpretable Analysis of

Complex Unstructured Data

Ke Deng

Tsinghua University

Abstract: The great success of large deep learning models in various applications in recent years has encouraged many researchers to seek improved performance by utilizing larger models and bigger data in practical problems involving unstructured data, leading to a growing inclination to pursue large models everywhere. However, the

fundamental principle of statistical modelling tells us

that an over-flexible large model without a clear focus

on unique features of the problem of interest would

often lead to inefficient utilization of data and

sub-optimal results. In this talk, we will provide concrete examples, in the context of video analysis, showing that deep learning can be greatly enhanced by statistical learning

once we integrate them wisely. We hope these examples could inspire more research efforts on developing

advanced statistical approaches with competitive performance and transparent interpretation for analyzing

complex unstructured data on top of deep learning.

Joint work with Haifeng Wang.

Exploring Novel Uncertainty Quantification

through Forward Intensity Function Modeling

Cheng Yong Tang

Temple University

Abstract: Predicting future time-to-event outcomes is a

foundational task in statistical learning. While various

methods exist for generating point predictions, quantifying the associated uncertainties poses a more substantial challenge. In this study, we introduce an innovative approach specifically designed to address

this challenge, accommodating dynamic predictors

that may manifest as stochastic processes. Our investigation harnesses the forward intensity function in a

novel way, providing a fresh perspective on this intricate problem. The framework we propose demonstrates remarkable computational efficiency, enabling

efficient analyses of large-scale investigations. We

validate its soundness with theoretical guarantees, and

our in-depth analysis establishes the weak convergence of function-valued parameter estimations. We

illustrate the effectiveness of our framework with two

comprehensive real examples and extensive simulation studies.

Dynamic Synthetic Control Method for Semiparametric Time-Varying Additive Autoregression

Model

Shouxia Wang

Peking University

Abstract: Motivated by evaluating the treatment effects of a policy for nonlinear time-varying confounding variables, we propose a dynamic synthetic

control (DSC) method under the semiparametric

time-varying additive autoregression outcome model.

The proposed method allows for micro-level data with

nonlinear time-varying confounders, multiple treated

units and spatial correlations in the data.

A spline-backfitted kernel estimation method is used to obtain good estimates of the unknown additive

functions, which are then used for matching when we

construct the DSC weights. The DSC weights are

constructed by the empirical likelihood, which guarantees a unique solution and a consistent estimation of

the average treatment effect on the treated group.

The semiparametric additive model provides more flexibility in modelling and estimation, making it more

favorable when either the parametric form of the

model is unknown or the model is incorrectly specified. We have developed an unconfoundedness

assessment test based on the estimated effects in the

pre-treatment period and a normalized placebo test to

determine the significance of the estimated treatment

effects. The proposed DSC method is demonstrated by

both numerical simulations and real data examples

that highlight the effects of the air pollution alerts in

Beijing and the COVID-19 lockdown in Shanghai.

Joint work with Song Xi Chen, Xiangyu Zheng.

Invited Session IS050: Recent Advances in Functional and Complex Data

Functional Principal Component Analysis of Spatially and Temporally Indexed Point Processes

Yehua Li

University of California, Riverside

Abstract: We model spatially and temporally indexed

point process data as a multi-level log-Gaussian Cox

process where the log intensity function depends on a

partially linear single-index structure of spatio-temporal covariates and three latent functional

random effects representing the spatial and temporal

random effects as well as their interactions. We assume that the latent functional effects are Gaussian

processes with Karhunen-Loeve representations, and

model the unknown link function of the single-index

as well as the covariance functions of the latent functional effects as splines. We propose to estimate the

partially linear coefficients and the single-index link

function using a Poisson maximum likelihood method,

and the covariance functions of the latent processes


using maximum composite likelihood methods. We

also propose approaches to predict the functional

principal component scores. Under the multi-level

dependence structure and allowing the spatio-temporal covariates to be non-stationary, the proposed estimators follow rather unconventional convergence rates which depend on both the number of

locations and the number of repeated measures in time.

We illustrate the proposed method through a simulation study and a real-data application in modeling

bike-sharing events.

Joint work with Kun Huang, Yongtao Guan.

Functional Neural Networks

Jiguo Cao

Simon Fraser University

Abstract: Functional data analysis (FDA) is a growing statistical field for analyzing curves, images, or

any multidimensional functions, in which each random function is treated as a sample element. Functional data is found commonly in many applications

such as longitudinal studies and brain imaging. In this

talk, I will present a methodology for integrating

functional data into deep neural networks. The model

is defined for scalar responses with multiple functional and scalar covariates. A by-product of the method is

a set of dynamic functional weights that can be visualized during the optimization process. This visualization leads to greater interpretability of the relationship

between the covariates and the response relative to

conventional neural networks. The model is shown to

perform well in a number of contexts including prediction of new data and recovery of the true underlying relationship between the functional covariate and

scalar response; these results were confirmed through

real data applications and simulation studies.

Causal Mediation Analysis for Multilevel and

Functional Data

Xi Luo

The University of Texas Health Science Center at

Houston

Abstract: Causal mediation analysis typically involves conditions that may not be applicable in neuroimaging studies. We introduce a multilevel causal

mediation framework to overcome this limitation and

more accurately quantify information flow in brain

pathways. This framework is designed to tackle several challenges: unmeasured mediator-outcome confounding, multilevel time series analysis, and the estimation of functional causal effects. Our approach is

grounded in multilevel structural equation modeling,

complemented by relaxed likelihood estimation

methods. Interestingly, certain causal estimates, typically unobtainable in simpler data structures, become

identifiable in our more complex data setting. We

provide proof of the asymptotic properties of our estimators and illustrate the numerical properties

through empirical analysis. Additionally, we utilize

real fMRI data to demonstrate the practical effectiveness of our proposed framework.

Joint work with Yi Zhao, Michael Sobel, Martin

Lindquist, Brian Caffo.

Frequent-Voting Independence Screening for Data

of Different Types or Different Dimensions

Kehui Chen

University of Pittsburgh

Abstract: Modern datasets often include different

types of variables with complex features, making

variable selection particularly challenging. For example, a measure of dependence with the response variable may not be directly comparable among predictor

variables of different types such as functional data. To

address this challenge, this work proposes a frequent-voting based independence screening method for

variable selection, which avoids a direct comparison

of the dependence measure among different variables.

Asymptotic analyses show that the proposed method

selects all of the active variables with probability

converging to one. We also demonstrate its great finite

sample performance through numerical experiments

and the application to an ADHD study.

Joint work with Haeun Moon.

Invited Session IS071: Statistical Inference for

High-Dimensional Data

Some Recent Results for P-Value Free FDR Controls

Jun Liu

Harvard University

Abstract: There has been significant interest among

researchers in false discovery rate (FDR) control

methods partially due to the strong desire from the

scientific community for reproducibility and replicability of scientific discoveries. I will discuss our recent efforts trying to go beyond the recently popular

p-value-free FDR control methods such as the

knockoff filter (KF), data splitting (DS), and Gaussian

mirror (GM). We present some power analysis of

these methods under the weak-and-rare signal framework and discuss its implications under different correlation structures of the design matrix. We then focus

on the DS procedure and its variant. In particular, we

reformulate the DS method into a two-step procedure:

using part of the data for estimation and feature ranking (in a regression setting) and using the other part as

checking/validation. FDR control can be achieved by

monitoring how well the validation goes along the

feature ranking. Under this setup, we may utilize external information and apply any procedure, such as a

Bayesian method with spike-and-slab priors, to work

on the first part of the data. We show that substantial

power gain can be achieved in this way.

Joint work with Buyu Lin, Tracy Ke, Yuanchuan

Guo.

Enhancing Integrative Association Tests: Optimal

Weighting Approaches in Whole-Genome Sequencing Studies

Zheyang Wu

Worcester Polytechnic Institute

Abstract: Integrative association tests are a crucial

method for detecting association signals and have

broad applications. Weighting is an important strategy

for incorporating useful information to increase statistical power. For example, in genetic association studies using whole-genome sequencing (WGS) data, SNP

allele frequencies and annotations are considered indicative of the likelihood and effect size of genetic

causal variants. Consequently, they are widely utilized

in weighted integrative association tests to enhance

the identification of novel genes associated with human complex traits. However, the rationale for their

use is mostly based on biological motivations. In this

study, we reveal the statistical mechanisms by which

weighting contributes to increased power, deduce the

optimal weights based on signal and data correlation,

and discuss the advantages and limitations of

weighting. In particular, we establish the asymptotically optimal weights for a general framework of

weighted p-value combination tests, which include

prevalent methods used in genetic association studies.

We also explore the principles for estimating optimal

weights in practice. Our findings are validated through

extensive simulations and real data analysis.

Joint work with Hong Zhang, Ming Liu, John

Landers.
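One concrete member of the weighted p-value combination framework is the weighted Stouffer test, sketched below (scipy assumed). The closing comment reflects the classical fact that, for independent tests with normal-means alternatives, the noncentrality of the combined statistic is maximized when weights are proportional to the expected signals, which is the kind of optimality the abstract generalizes.

```python
import numpy as np
from scipy.stats import norm

def weighted_stouffer(pvals, weights):
    """Weighted Stouffer combination of one-sided p-values.

    Combined Z = sum(w_i * Z_i) / sqrt(sum(w_i^2)) is standard normal
    under the global null of independent tests. For normal-means
    alternatives, power is maximized by w_i proportional to the
    expected signal of test i (by Cauchy-Schwarz).
    """
    z = norm.isf(np.asarray(pvals))          # p-values -> z-scores
    w = np.asarray(weights, dtype=float)
    z_comb = np.dot(w, z) / np.sqrt(np.sum(w ** 2))
    return norm.sf(z_comb)                   # combined one-sided p-value

# e.g. up-weighting a variant with a strong functional annotation:
# p_comb = weighted_stouffer([0.02, 0.40, 0.11], weights=[2.0, 0.5, 1.0])
```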

Detection and Statistical Inference on Informative

Core and Periphery Structures in Weighted Directed Networks

Wen Zhou

New York University

Abstract: In network analysis, noise and biases,

which are often introduced by peripheral or

non-essential components, can mask pivotal structures

and hinder the efficacy of many network modeling

and inference procedures. Recognizing this, identification of the core-periphery (CP) structure has

emerged as a crucial data pre-processing step. While

the identification of the CP structure has been instrumental in pinpointing core structures within networks,

its application to directed weighted networks has been

underexplored. Many existing efforts either fail to

account for the directionality or lack theoretical justification for the identification procedure. In this

work, we seek answers to three pressing questions: (i)

How to distinguish the informative and

non-informative structures in weighted directed networks? (ii) What approach offers computational efficiency in discerning these components? (iii) Upon the

detection of CP structure, can uncertainty be quantified to evaluate the detection? We adopt the signal-plus-noise model, categorizing uniform relational

patterns as non-informative, by which we define the


sender and receiver peripheries. Furthermore, instead

of confining the core component to a specific structure,

we consider it complementary to either the sender or

receiver peripheries. Based on our definitions on the

sender and receiver peripheries, we propose spectral

algorithms to identify the CP structure in directed

weighted networks. Our algorithm stands out with

statistical guarantees, ensuring the identification of

sender and receiver peripheries with overwhelmingly

probability. Additionally, our methods scale effectively for expansive directed networks. Implementing our

methodology on faculty hiring network data revealed

captivating insights into the informative structures and

distinctions between informative and non-informative

sender/receiver nodes across various academic disciplines.

Invited Session IS006: AI and Machine Learning in

Single Cell Genomic

Integrating Transcriptomic and Pathomic Features

to Reconstruct 3D Tissue Maps with Super-Resolution

Mingyao Li

University of Pennsylvania

Abstract: Solid tissues form complex 3D structures,

and examining the tissue microenvironment in 3D

context allows researchers to gain a comprehensive

understanding of how cells interact within the original

tissue context. This 3D information also reveals spatial relationships between different cell types and

signaling pathways that are not observable in 2D tissue sections. In this talk, I will present our recently

developed tool that is aimed at generating single-cell

resolution 3D ST tissue maps while significantly reducing experimental costs. By integrating information

from spatial transcriptomics and pathology imaging

data, our method gradually increases gene expression

resolution down to the single-cell level. Additionally,

we have developed an algorithm to register tissue

sections obtained from serial tissue cuts and impute

missing gene expression data between tissue gaps,

enabling the construction of accurate 3D tissue volumes. The resulting analysis will not only generate a

single-cell resolution spatial transcriptomics tissue

map but also facilitate detailed characterization and

quantification of tissue structures of interest in 3D.

Joint work with Daiwei Zhang.

A Hybrid Approach for Selecting Highly Variable

Genes in Single-Cell RNA-Seq

Hongkai Ji

Johns Hopkins University

Abstract: Selecting highly variable genes (HVGs) or

features (HVFs) is a key component of many single

cell RNA-seq data analysis pipelines. Here we conduct a systematic benchmark study of 47 existing and

new HVG selection methods using 19 benchmark

datasets and an average of 18 evaluation criteria per

method. We found that a hybrid approach integrating

features from multiple methods robustly outperformed

existing individual methods, yielding more accurate

cell clustering and label transfer, and improved

cross-modality correlation. We developed an R package mixhvg that delivers this hybrid solution. Users

can conveniently use this package to perform HVG

selection independently or as part of their custom data

analysis pipelines.

Joint work with Ruzhang Zhao, Jiuyao Lu, Weiqiang

Zhou, Ni Zhao.

Supervised Deep Learning with Gene Annotation

for Cell Classification

Wei Sun

Fred Hutchinson Cancer Center

Abstract: Gene-by-gene differential expression analysis is a widely used supervised-learning method for

analyzing single-cell RNA sequencing (scRNA-seq)

data. However, due to the large number of cells in

scRNA-seq studies, such analysis can lead to many

differentially expressed genes with extremely small

p-values but minimal effect sizes, making interpretation challenging. To address this issue, we proposed

an alternative method called Supervised Deep Learning with Gene Annotation (SDAN). SDAN integrates

gene annotation and gene expression data using a

graph neural network, which identifies gene sets that

accurately classify cells. By using SDAN, we have

successfully identified gene sets associated with severe COVID-19, Alzheimer's disease, and cancer

patients' response to immunotherapy.

Joint work with Zhexiao Lin.

Invited Session IS054: Recent Advances in Statistical Machine Learning

A Bayesian Framework for Leveraging Pretrained

Large Diffusion Models

Jian Huang

The Hong Kong Polytechnic University

Abstract: Diffusion-based generative models have

achieved remarkable successes in learning complex

probability measures for various types of data, including image, video, audio, and biomedical data.

Researchers have taken steps to leverage pre-trained

large-scale models with a significantly reduced

amount of data, enabling them to generate samples

that align with the dataset's support and achieve comparable quality. The combination of learnable modules

and large models has shown impressive generation

capabilities. Therefore, it is useful to understand how

we can leverage a large model for analyzing data from

\"a small probability space\" with a limited amount of

data. In this work, we formulate a Bayesian framework for leveraging large diffusion models in generative tasks. We clarify the meaning behind leveraging a

large model for analyzing data from a "small probability space" and explore the task of leveraging

pre-trained models using learnable modules from a

Bayesian perspective.

Joint work with Ding Huang, Ting Li.

Unsupervised Federated Learning: A Federated

Gradient EM Algorithm for Heterogeneous Mixture Models with Robustness Against Adversarial

Attacks

Yang Feng

New York University

Abstract: While supervised federated learning approaches have enjoyed significant success, the domain

of unsupervised federated learning remains relatively

underexplored. In this paper, we introduce a novel

federated gradient EM algorithm designed for the

unsupervised learning of mixture models with heterogeneous mixture proportions across tasks. We begin

with a comprehensive finite-sample theory that holds

for general mixture models, then apply this general

theory to Gaussian Mixture Models (GMMs) and

Mixture of Regressions (MoRs) to characterize the

explicit estimation error of model parameters and

mixture proportions. Our proposed federated gradient

EM algorithm demonstrates several key advantages:

adaptability to unknown task similarity, resilience

against adversarial attacks on a small fraction of data

sources, protection of local data privacy, and computational and communication efficiency.

Joint work with Ye Tian, Haolei Weng.
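For intuition about the structure of such a scheme, here is a minimal sketch of one communication round of gradient EM for a shared-mean Gaussian mixture with client-specific mixture proportions, simplified to identity covariances and equal-weight averaging; it illustrates the general shape of a federated gradient EM, not the paper's algorithm.

```python
import numpy as np

def local_grad_em_step(X, mu, pi_k, lr=0.1):
    """One local gradient-EM step on client data X (identity covariances).

    mu   : (K, d) shared component means
    pi_k : (K,) this client's mixture proportions
    """
    # E-step: responsibilities under the current parameters
    d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(-1)     # (n, K)
    log_r = np.log(pi_k) - 0.5 * d2
    log_r -= log_r.max(axis=1, keepdims=True)
    r = np.exp(log_r)
    r /= r.sum(axis=1, keepdims=True)
    # Gradient step on shared means; exact update for local proportions
    grad_mu = (r[:, :, None] * (X[:, None, :] - mu[None, :, :])).mean(0)
    return mu + lr * grad_mu, r.mean(0)

def server_round(clients, mu, pis, lr=0.1):
    """Server averages the shared means; proportions stay on each client."""
    updates = [local_grad_em_step(X, mu, pi, lr) for X, pi in zip(clients, pis)]
    mu_new = np.mean([u[0] for u in updates], axis=0)
    return mu_new, [u[1] for u in updates]
```

Only the updated means travel to the server, which is what keeps the raw local data private.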

Value Enhancement of Reinforcement Learning

via Efficient and Robust Trust Region Optimization

Fan Zhou

Shanghai University of Finance and Economics

Abstract: Reinforcement learning (RL) is a powerful

machine learning technique that enables an intelligent

agent to learn an optimal policy that maximizes the

cumulative rewards in sequential decision making.

Most methods in the existing literature are developed in online settings where the data are easy to collect or simulate. Motivated by high-stakes domains

such as mobile health studies with limited and

pre-collected data, in this article, we study offline

reinforcement learning methods. To efficiently use

these datasets for policy optimization, we propose a

novel value enhancement method to improve the performance of a given initial policy computed by existing state-of-the-art RL algorithms. Specifically, when

the initial policy is not consistent, our method will

output a policy whose value is no worse and often

better than that of the initial policy. When the initial

policy is consistent, under some mild conditions, our

method will yield a policy whose value converges to

the optimal one at a faster rate than the initial policy,

achieving the desired "value enhancement" property.

The proposed method is generally applicable to any

parameterized policy that belongs to certain

pre-specified function class (e.g., deep neural networks). Extensive numerical studies are conducted to

demonstrate the superior performance of our method.

Taming \"Data-Hungry\" Reinforcement Learning?

Stability in Continuous State-Action Spaces

Yaqi Duan

New York University

Abstract: We introduce a novel framework for analyzing reinforcement learning (RL) in continuous

state-action spaces, and use it to prove fast rates of

convergence in both off-line and on-line settings. Our

analysis highlights two key stability properties, relating to how changes in value functions and/or policies

affect the Bellman operator and occupation measures.

We argue that these properties are satisfied in many

continuous state-action Markov decision processes,

and demonstrate how they arise naturally when using

linear function approximation methods. Our analysis

offers fresh perspectives on the roles of pessimism

and optimism in off-line and on-line RL, and highlights the connection between off-line RL and transfer

learning.

Joint work with Martin Wainwright.

Invited Session IS060: Recent Developments in

Complex Time Series Analysis

Matrix Denoising and Completion Based on Kronecker Product Approximation

Han Xiao

Rutgers University

Abstract: We consider the problem of matrix denoising and completion induced by the Kronecker

product decomposition. Specifically, we propose to

approximate a given matrix by the sum of a few

Kronecker products of matrices, which we refer to as

the Kronecker product approximation (KoPA). Because the Kronecker product is an extension of the

outer product from vectors to matrices, KoPA extends

the low rank matrix approximation, and includes it as

a special case. Compared with the latter, KoPA also

offers a greater flexibility, since it allows the user to

choose the configuration, namely the dimensions of

the two smaller matrices forming the Kronecker

product. On the other hand, the configuration to be

used is usually unknown, and needs to be determined

from the data in order to achieve the optimal balance

between accuracy and parsimony. We propose to use

extended information criteria to select the configuration. Under the paradigm of high dimensional analysis,

we show that the proposed procedure is able to select

the true configuration with probability tending to one,

under suitable conditions on the signal-to-noise ratio.

We demonstrate the superiority of KoPA over the low

rank approximations through numerical studies, and

several benchmark image examples.

Joint work with Chencheng Cai, Rong Chen.
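For a single Kronecker term and a fixed configuration, the best approximation has a classical closed form via the Van Loan-Pitsianis rearrangement: reshaping turns the nearest-Kronecker-product problem in Frobenius norm into a rank-1 approximation solved by an SVD. The sketch below illustrates this building block only, not the full KoPA procedure with configuration selection.

```python
import numpy as np

def nearest_kronecker(A, p, q):
    """Best one-term Kronecker approximation A ~ B kron C in Frobenius norm.

    A is (p*r) x (q*s); B is p x q and C is r x s. After rearranging
    each (r x s) block of A into a row, the problem becomes a rank-1
    approximation of the rearranged matrix (Van Loan-Pitsianis).
    """
    r, s = A.shape[0] // p, A.shape[1] // q
    R = A.reshape(p, r, q, s).transpose(0, 2, 1, 3).reshape(p * q, r * s)
    U, sv, Vt = np.linalg.svd(R, full_matrices=False)
    B = np.sqrt(sv[0]) * U[:, 0].reshape(p, q)
    C = np.sqrt(sv[0]) * Vt[0].reshape(r, s)
    return B, C

# sanity check: exact recovery when A is itself a Kronecker product
B0, C0 = np.random.randn(3, 2), np.random.randn(4, 5)
B, C = nearest_kronecker(np.kron(B0, C0), 3, 2)
assert np.allclose(np.kron(B, C), np.kron(B0, C0))
```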

Simultaneous Decorrelation of Matrix Time Series

Yuefeng Han

University of Notre Dame

Abstract: We propose a contemporaneous bilinear

transformation for a p x q matrix time series to alleviate the difficulties in modeling and forecasting matrix

time series when p and/or q are large. The resulting

transformed matrix assumes a block structure consisting of several small matrices, and those small matrix

series are uncorrelated across all times. Hence an

overall parsimonious model is achieved by modelling

each of those small matrix series separately without

the loss of information on the linear dynamics. Such a

parsimonious model often has better forecasting performance, even when the underlying true dynamics

deviates from the assumed uncorrelated block structure after transformation. The uniform convergence

rates of the estimated transformation are derived,

which vindicate an important virtue of the proposed

bilinear transformation, i.e. it is technically equivalent

to the decorrelation of a vector time series of dimension max(p,q) instead of p x q. The proposed method

is illustrated numerically via both simulated and real

data examples.

Joint work with Rong Chen, Cun-Hui Zhang, Qiwei

Yao.

Tensor Factor Model Estimation by Iterative Projection

Dan Yang


The University of Hong Kong

Abstract: Tensor time series, which is a time series

consisting of tensorial observations, has become ubiquitous. It typically exhibits high dimensionality. One

approach for dimension reduction is to use a factor

model structure, in a form similar to Tucker tensor

decomposition, except that the time dimension is

treated as a dynamic process with a time dependent

structure. In this paper we introduce two approaches

to estimate such a tensor factor model by using iterative orthogonal projections of the original tensor time

series. These approaches extend the existing estimation procedures and improve the estimation accuracy

and convergence rate significantly as proven in our

theoretical investigation. Our algorithms are similar to

the higher order orthogonal projection method for

tensor decomposition, but with significant differences

due to the need to unfold tensors in the iterations and

the use of autocorrelation. Consequently, our analysis

is significantly different from the existing ones.

Computational and statistical lower bounds are derived to prove the optimality of the sample size requirement and convergence rate for the proposed

methods. Simulation study is conducted to further

illustrate the statistical properties of these estimators.

Joint work with Yuefeng Han, Rong Chen, Cun-Hui

Zhang.

Invited Session IS045: Recent Advancements in

Large Network and Tensor Data Analysis

Statistical Foundations of Deep Generative Models

Lizhen Lin

University of Maryland

Abstract: Deep generative models are probabilistic

generative models where the generator is parameterized by a deep neural network. They are popular models for modeling high-dimensional data such as texts,

images and speeches, and have achieved impressive

empirical success. Despite demonstrated success in

empirical performance, theoretical understanding of

such models is largely lacking. We investigate statistical properties of deep generative models from a

nonparametric distribution estimation viewpoint. In

the considered model, data are assumed to be observed in some high-dimensional ambient space but

concentrate around some low-dimensional structure

such as a lower-dimensional manifold. This talk will

provide an explanation of why deep generative models can perform well from the lens of statistical theory.

In particular, we will provide insights into i) how

deep generative models can avoid the curse of dimensionality and outperform classical nonparametric estimates, and ii) how likelihood approaches work for

high-dimensional distribution estimation, especially in

adapting to the intrinsic geometry of the data.

Autoregressive Networks with Dependent Edges

Qiwei Yao

London School of Economics

Abstract: We propose an autoregressive framework

for modelling dynamic networks with dependent edges. It encompasses the models which accommodate,

for example, transitivity, density-dependent and other

stylized features often observed in real network data.

By assuming the edges of the network at each time are

independent conditionally on their lagged values, the

models, which exhibit a close connection with temporal ERGMs, facilitate both simulation and maximum likelihood estimation in a straightforward manner. Due to the possibly large number of parameters in the models, the initial MLEs may suffer from

slow convergence rates. An improved estimator for

each component parameter is proposed based on an iterative projection that mitigates the

impact of the other parameters. Based on a martingale

difference structure, the asymptotic distribution of the

improved estimator is derived without the stationarity

assumption. The limiting distribution is not normal in

general, and it reduces to normal when the underlying

process satisfies some mixing conditions. Illustrations with a transitivity model are provided using both simulations and two real network data sets.

Analysis of Large Networks

Jiashun Jin

Carnegie Mellon University

Abstract: The block-model family has four popular

network models: SBM, MMSBM, DCBM, and


DCMM. A fundamental problem is, how well each of

these models fits with real networks. We propose

GoF-MSCORE as a new Goodness-of-Fit (GoF) metric for DCMM (the broadest one among the four),

with two main ideas. The first is to use cycle count

statistics as a general recipe for GoF. The second is a

novel network fitting scheme. GoF-MSCORE is a

flexible GoF approach. We adapt it to all four models

in the block-model family. We show that for each of

the four models, if the assumed model is correct, then

the corresponding GoF metric converges to the standard normal as the network sizes diverge. We also analyze the powers and show that these metrics are optimal in many settings. For 11 frequently-used real

networks, we use the proposed GoF metrics to show

that DCMM fits well with almost all of them. We also

show that SBM, DCBM, and MMSBM do not fit well

with many of these networks, especially when the

networks are relatively large.

High-Order Singular Value Decomposition in Tensor Analysis

Anru Zhang

Duke University

Abstract: The analysis of tensor data, i.e., arrays with

multiple directions, is motivated by a wide range of

scientific applications and has become an important

interdisciplinary topic in data science. In this talk, we

discuss the fundamental task of performing Singular

Value Decomposition (SVD) on tensors, exploring

both general cases and scenarios with specific structures like smoothness and longitudinality. Through the

developed frameworks, we can achieve accurate denoising for 4D scanning transmission electron microscopy images; in longitudinal microbiome studies,

we can extract key components in the trajectories of

bacterial abundance, identify representative bacterial

taxa for these key trajectories, and group subjects

based on the change of bacteria abundance over time.

We also showcase the development of statistically

optimal methods and computationally efficient algorithms that harness valuable insights from

high-dimensional tensor data, grounded in theories of

computation and non-convex optimization.

Invited Session IS083: Limit Theory of Large Dimensional Random Matrices (大维随机矩阵极限

理论)

Nonlinear Principal Component Analysis with

Random Bernoulli Features for Process Monitoring

Dandan Jiang

Xi'an Jiaotong University

Abstract: This paper proposes a new random map,

the random Bernoulli feature, which captures nonlinear patterns in the process efficiently and quickly.

First, we derive its convergence bound for approximating the Gaussian kernel and apply the random

Bernoulli features to PCA to obtain its nonlinear variant: random Bernoulli PCA (RBPCA). Second, the

framework for implementing process monitoring using

RBPCA is described and related to other tools such as

time-lagged structure and moving window. As a result,

three nonlinear process monitoring methods based on

RBPCA are proposed, which can extract the dynamic

properties of the process or make the model adaptive.

These methods utilizing random Bernoulli features

offer scalability and lower computational cost compared to kernel-based methods. Finally, the performance of process monitoring is demonstrated on a numerical example, the Tennessee Eastman Process, and the Server Machine Dataset. The superiority of the nonlinear process monitoring methods based on the random Bernoulli feature is confirmed.

Joint work with Ke Chen, Shurong Zheng.
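A sketch of the overall pipeline is given below, using random Fourier features, whose inner products approximate the Gaussian kernel, as a stand-in for the paper's random Bernoulli features; PCA in feature space is followed by a Hotelling-type T^2 statistic with an empirical control limit. Details and names are illustrative.

```python
import numpy as np

def fit_random_feature_monitor(X_train, n_feat=200, n_pc=5, gamma=1.0, seed=0):
    """Nonlinear PCA process monitoring via random features (a sketch)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(X_train.shape[1], n_feat))
    b = rng.uniform(0, 2 * np.pi, n_feat)
    feat = lambda X: np.sqrt(2.0 / n_feat) * np.cos(X @ W + b)  # Rahimi-Recht map

    Z = feat(X_train)
    mean = Z.mean(axis=0)
    _, sv, Vt = np.linalg.svd(Z - mean, full_matrices=False)
    P = Vt[:n_pc].T                            # principal directions
    lam = sv[:n_pc] ** 2 / (len(Z) - 1)        # score variances
    t2 = lambda X: (((feat(X) - mean) @ P) ** 2 / lam).sum(axis=1)
    limit = np.quantile(t2(X_train), 0.99)     # empirical control limit
    return t2, limit                           # flag a fault when t2(x) > limit
```

Because the feature map is explicit, monitoring costs scale with the number of features rather than the number of training samples, which is the scalability advantage over kernel-based methods noted in the abstract.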

An Integrative Multi-Context Mendelian Randomization Method for Identifying Risk Genes

Across Human Tissues

Fan Yang

Tsinghua University

Abstract: Mendelian randomization (MR) provides

valuable assessments of the causal effect of exposure

on outcome, yet the application of conventional MR

methods for mapping risk genes encounters new challenges. One of the issues is the limited availability of

expression quantitative trait loci (eQTLs) as instrumental variables (IVs), hampering the estimation of


sparse causal effects. Additionally, the often context/tissue-specific eQTL effects challenge the MR

assumption of consistent IV effects across eQTL and

GWAS data. To address these challenges, we propose

a multi-context multivariable integrative MR framework, mintMR, for mapping expression and molecular

traits as joint exposures. It models the effects of molecular exposures across multiple tissues in each gene

region, while simultaneously estimating across multiple gene regions. It uses eQTLs with consistent effects

across more than one tissue type as IVs, improving IV

consistency. A major innovation of mintMR involves

employing multi-view learning methods to collectively model latent indicators of disease relevance across

multiple tissues, molecular traits, and gene regions.

The multi-view learning captures the major patterns of

disease-relevance and uses these patterns to update the

estimated tissue relevance probabilities. The proposed

mintMR iterates between performing a multi-tissue

MR for each gene region and jointly learning the disease-relevant tissue probabilities across gene regions,

improving the estimation of sparse effects across

genes. We apply mintMR to evaluate the causal effects of gene expression and DNA methylation for 35

complex traits using multi-tissue QTLs as IVs. The

proposed mintMR controls genome-wide inflation and

offers new insights into disease mechanisms.

Joint work with Yihao Lu, Lin Chen.

Limit Theorems for U-Statistics of Determinantal

Point Process via Cumulant Estimates

Dong Yao

Jiangsu Normal University

Abstract: In this talk, we will derive the first and

second order Wiener chaos decomposition for

the U-statistics of determinantal processes associated

with spectral projection kernels on the d-dimensional

unit spheres. We first derive a graphical representation

for the cumulants of the U-statistics of any determinantal process. The main results are established by

combining precise estimates on the graph structure of

this representation with the spectral projection kernels.

The approach can be adapted to other determinantal

point processes, and similar results may hold. We also

compare our results with Hoeffding decomposition for

U-statistics of i.i.d. random variables.

Joint work with Renjie Feng and Friedrich Götze.

Quantitative Tracy-Widom Laws for Wigner and

Sample Covariance Matrices

Yuanyuan Xu

Chinese Academy of Sciences

Abstract: This talk will discuss a quantitative Tracy-Widom law for the largest eigenvalue of Wigner

matrices. More precisely, we will prove that the fluctuations of the largest eigenvalue of a Wigner matrix

of size N converge to the Tracy-Widom limit at a rate of nearly N^{-1/3} as N tends to infinity. Moreover, we

also establish a small deviation from the Tracy-Widom distribution for the largest eigenvalue of

Wigner matrices. The same results also hold true for

the largest eigenvalue of sample covariance matrices,

which plays a significant role in the principal component analysis. These are based on several joint works

with Kevin Schnelli (KTH).

Joint work with Kevin Schnelli.

Invited Session IS099: Economic Statistics and

Research on High-Quality Development (经济统计

与高质量发展研究)

Research on the Construction of Industrial Chain

and Supply Chain Network and Resilience Evaluation

产业链供应链关联网络构建及韧性评估研究

Shaohua Ge

Zhongnan University of Economics and Law

Abstract: China is an important driving force of global industrial growth and plays a key role in global industrial chains, yet problems such as industrial and supply chains being "large but not strong, complete but not optimal" have not been fundamentally resolved. As industrial chain development faces a series of challenges, research on guarding against potential risks and improving the resilience of industrial and supply chains is highly necessary. Based on industrial linkage theory and complex network theory, this study analyzes the meaning, development, and characteristics of industrial and supply chain resilience. Taking a micro-level perspective, we construct an industrial and supply chain linkage network based on inter-firm relationships, analyze node-level and network-level features, and identify important nodes, laying the foundation for resilience assessment. We then comprehensively evaluate industrial and supply chain resilience from two aspects, the capacity to resist risk and the capacity to recover after suffering risk, and further analyze its influencing factors. The study finds that the overall risk resistance of industrial and supply chains has been strengthening, that risk-recovery capacity first declined and then rose, and that the resilience level has been improving year by year; firms' financing constraints, equity ratio, and overall tax burden, among other factors, have positive effects on the development of nodes in the industrial and supply chain network.

ESG Performance, Financing Cost, and

High-Quality Development of Enterprises

ESG 表现、融资成本与企业高质量发展

Yating Gui

Zhongnan University of Economics and Law

Abstract: Financing cost has a decisive influence on firms' survival and development, and as capital markets pay markedly more attention to corporate ESG disclosure, the importance of ESG for the high-quality development of enterprises has become increasingly prominent. Using sample data on 3,467 A-share listed companies from 2009 to 2020, we examine the effect of ESG performance on high-quality development and, by constructing a moderated mediation model, focus on the role of financing cost between the two. The results show that good ESG performance contributes to high-quality development. Mechanism analysis shows that financing cost partially mediates the relationship between ESG performance and high-quality development, and that firms' innovation capability and market competition structure moderate this mediation effect. Heterogeneity analysis shows that the positive effect of ESG performance on high-quality development is more pronounced in central and western regions, lightly polluting industries, and non-state-owned enterprises. The study offers new ideas for accurately understanding and evaluating the social and economic effects of corporate ESG performance, enriches the literature on the economic consequences of ESG performance and the determinants of high-quality development, and provides empirical evidence and policy references for high-quality development from the perspective of financing cost.

Discussion on the Statistical Monitoring System of

Financial Security

金融安全统计监测体系探讨

Zihuan Gao

Zhongnan University of Economics and Law

Abstract: As an important component of national security, financial security is a strategic and fundamental matter bearing on the overall economic and social development of China. What, then, is financial security, and how should it be measured? Based on a thorough understanding of financial security, and building on a review of domestic and international research on financial security monitoring, this paper distills the statistical meaning of financial security. Centered on two aspects, the normal functioning of the macroeconomy and the financial system and the dependence between the financial and economic sectors, we set up three subsystems: external risk resistance, financial system stability, and economic sector operation, and then construct a statistical monitoring system for financial security comprising 22 indicators across 9 dimensions. The monitoring system is characterized by its tight focus on the theme of "security", its fit to China's national conditions combined with an international perspective, its small but refined set of indicators, and the availability of the required data.

Joint work with Hu Zhang.

Influencing Factors and Potential Measurement of

China's Export Trade under the \"Belt and Road\"

Vision

“一带一路”视域下中国出口贸易影响因素及潜力

测度

Qinqin Zhu

Zhongnan University of Economics and Law

Abstract: Using panel data for 2012-2022, this paper applies a stochastic frontier gravity model to measure the efficiency and potential of China's export trade with 66 countries along the "Belt and Road", and analyzes the influencing factors with a one-step approach. The results show that the GDP of both sides, the population of the trading partner, and trade dependence have positive effects on China's export trade, whereas geographic distance and China's own population have negative effects; outward foreign direct investment and free trade agreements promote China's exports, while infrastructure construction acts as an obstacle; and China's export efficiency and potential differ markedly across the countries along the "Belt and Road". The study offers reference value for the foreign investment and regional policies of Belt and Road countries and other economies, provides important guidance for Chinese enterprises "going global", and puts forward policy suggestions for upgrading Belt and Road economic and trade cooperation.

The Mechanism and Impact of Returning Home to

Start a Business on Improving County-Level Total

Factor Productivity: Based on the Investigation of

the Pilot Policy of Returning Home to Start a

Business

返乡创业提升县域全要素生产率的作用机制与影

响效应——基于返乡创业试点政策的考察

Huicong Wang

Zhongnan University of Economics and Law

Abstract: Based on panel data for 1,789 counties nationwide from 2012 to 2020, we use a staggered difference-in-differences model to analyze the effects and mechanisms through which the pilot policy of government support for returning home to start businesses affects county-level total factor productivity. The results show that the pilot policy significantly raises county-level total factor productivity. Mechanism analysis finds that the policy achieves this by mitigating the negative effects of government intervention, improving the level of financial services, and stimulating county-level innovation. In addition, the policy's effects differ markedly by geographic location, economic foundation, level of industrial agglomeration, and level of human capital. Accordingly, counties should draw on their own development characteristics to build new growth poles through returnee entrepreneurship, and make good use of the looser market environment, financial support effects, and technological innovation effects it brings to raise county-level total factor productivity.

Joint work with Xian Zhu.

Invited Session IS091: Model Averaging and Related Topics

Model Averaging for Decomposed Data

Yuying Sun

Academy of Mathematics and Systems Science, Chinese Academy of Sciences

Abstract: The decomposition-ensemble algorithm has

received increasing attention in forecasting and related

fields, especially in capturing the nonlinear and nonstationary characteristics of time series data. A conventional strategy involves decomposing the target

time series into various oscillation modes from the

frequency domain and assigning equal weights to all

decomposed modes for aggregated prediction. However, disparities in forecasting performance arise

among different decomposed modes due to their distinct attributes and forecast horizons. This paper proposes a novel frequency decomposition-based model

averaging approach to combine decomposed modes

with appropriate weights, thereby enhancing the accuracy of the target time series forecast. It is shown

that the proposed model averaging estimator is asymptotically optimal in the sense of achieving the

lowest possible quadratic prediction risk. The rate of

the selected weights converging to the optimal

weights to minimizing the expected quadratic loss is

established. Simulation studies and empirical applications to consumption and exchange rate forecasting

highlight the merits of the proposed method.
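Here is a minimal sketch of the decompose-then-weight idea under strong simplifications: a moving-average split into two modes, AR(1) mode forecasts, and least-squares weights on a validation window rather than the paper's weight-choice criterion.

```python
# A minimal sketch of frequency-decomposition-based model averaging:
# split a series into low/high-frequency modes, forecast each mode
# separately, and learn combination weights on a validation window
# instead of fixing them all to one.
import numpy as np

rng = np.random.default_rng(3)
n = 400
t = np.arange(n)
y = 0.01 * t + np.sin(2 * np.pi * t / 50) + rng.normal(scale=0.2, size=n)

# Two-mode decomposition: smooth (low-frequency) part + remainder.
window = 25
low = np.convolve(y, np.ones(window) / window, mode="same")
modes = [low, y - low]                     # the modes sum back to y

def ar1_forecast(x):
    """One-step AR(1) forecasts of x[1:] from x[:-1], fit by OLS."""
    phi = np.dot(x[:-1], x[1:]) / np.dot(x[:-1], x[:-1])
    return phi * x[:-1]

preds = np.column_stack([ar1_forecast(m) for m in modes])  # (n-1, 2)
target = y[1:]

# Learn mode weights on a validation window by least squares.
val = slice(200, 300)
w, *_ = np.linalg.lstsq(preds[val], target[val], rcond=None)
test = slice(300, None)
err_w = np.mean((preds[test] @ w - target[test]) ** 2)
err_eq = np.mean((preds[test].sum(axis=1) - target[test]) ** 2)
print(f"weights={w}, weighted MSE={err_w:.4f}, equal-weight MSE={err_eq:.4f}")
```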

Model Averaging in Multivariate Spatial Autoregressive Model for Social Network Analysis

Fang Fang

East China Normal University

Abstract: In social network analysis, a crucial issue is to determine how network nodes interact with each other, that is, what spatial weight matrix to use within the framework of spatial autoregressive models. When the dependent variable is multivariate and there are multiple candidate weight matrices, this paper proposes a model averaging method based on a Mallows-type criterion to obtain a weighted estimate of the weight matrix. When the candidate weight matrices are all misspecified, the method is asymptotically optimal in the sense of minimizing the prediction error for the dependent variable. When the correct weight matrix is included among the candidates, the weighted estimation is consistent. Numerical simulations verify the theoretical results and the superiority of the proposed method. Two examples, on social networks and financial networks, are presented for illustration.
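For intuition, the snippet below implements the classical Mallows model averaging criterion in the simpler linear-regression setting (Hansen, 2007), where simplex weights minimize the residual sum of squares plus a 2*sigma^2-weighted effective-dimension penalty. The paper's Mallows-type criterion for candidate spatial weight matrices shares this flavor but differs in its details.

```python
# A minimal sketch of Mallows model averaging for nested linear models:
# weights on the probability simplex minimize RSS + 2 * sigma^2 * dims.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)
n, p = 200, 6
X = rng.normal(size=(n, p))
y = X[:, :3] @ np.array([1.0, 0.5, -0.5]) + rng.normal(size=n)

# Candidate models use the first k columns, k = 1..p.
fits, dims = [], []
for k in range(1, p + 1):
    Xk = X[:, :k]
    beta = np.linalg.lstsq(Xk, y, rcond=None)[0]
    fits.append(Xk @ beta)
    dims.append(k)
F, dims = np.column_stack(fits), np.array(dims, dtype=float)

sigma2 = np.mean((y - F[:, -1]) ** 2)      # error variance from largest model

def mallows(w):
    resid = y - F @ w
    return resid @ resid + 2.0 * sigma2 * dims @ w

M = F.shape[1]
res = minimize(mallows, np.full(M, 1.0 / M), method="SLSQP",
               bounds=[(0.0, 1.0)] * M,
               constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}])
print("Mallows weights:", np.round(res.x, 3))
```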

Averaging Method of Poisson Regression Models with Divergent Dimensions

Jiahui Zou

Capital University of Economics and Business

Abstract: This paper proposes a new model averaging method to address model uncertainty in Poisson regression, allowing the dimension of the covariates to grow with the sample size. Based on the Kullback–Leibler (KL) divergence, we derive an unbiased criterion for computing the model averaging weights. The results show that when all candidate models are misspecified, the proposed model averaging estimator is asymptotically optimal, that is, asymptotically equivalent under KL loss to the theoretically optimal averaging estimator. When the candidate set contains a correct model, the model averaging parameter estimator is consistent. Finally, we apply the method to studying the determinants of corporate innovation and to prediction.

Joint work with Wendun Wang, Xinyu Zhang, Guohua Zou.
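As a rough illustration, the sketch below averages nested Poisson GLMs with simplex weights chosen by a penalized Poisson deviance. This plug-in criterion is a stand-in for the paper's unbiased KL-based criterion, and the fixed-dimension setup ignores the divergent-dimension asymptotics.

```python
# A minimal sketch of model averaging for Poisson regression: fit several
# candidate Poisson GLMs, average their predicted means with simplex
# weights, and pick the weights minimizing a penalized Poisson deviance.
import numpy as np
from scipy.optimize import minimize
from sklearn.linear_model import PoissonRegressor

rng = np.random.default_rng(5)
n, p = 500, 5
X = rng.normal(size=(n, p))
lam = np.exp(0.5 + 0.8 * X[:, 0] - 0.5 * X[:, 1])
y = rng.poisson(lam)

# Nested candidate models on the first k covariates.
mus, dims = [], []
for k in range(1, p + 1):
    model = PoissonRegressor(alpha=0.0).fit(X[:, :k], y)
    mus.append(model.predict(X[:, :k]))
    dims.append(k + 1)                       # +1 for the intercept
MU, dims = np.column_stack(mus), np.array(dims, dtype=float)

def criterion(w):
    mu = MU @ w                              # averaged mean function
    dev = np.sum(mu - y * np.log(mu))        # Poisson neg. log-lik. (up to const.)
    return dev + dims @ w                    # AIC-style complexity penalty

M = MU.shape[1]
res = minimize(criterion, np.full(M, 1.0 / M), method="SLSQP",
               bounds=[(0.0, 1.0)] * M,
               constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}])
print("weights:", np.round(res.x, 3))
```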

Optimal Distributed Prediction by Model Averaging

Jun Liao

Renmin University of China

Abstract: In this paper, we develop a new distributed prediction approach. To obtain the data-driven weights for averaging, an unbiased estimator of the squared prediction risk is derived as the weight choice criterion. The proposed distributed prediction is obtained by combining the local predictions using the weights that minimize this criterion. The proposed method is shown to be asymptotically optimal in the sense of squared loss, and its convergence rate is also studied. Simulation and real data analyses show that the prediction based on the new method is remarkably superior to that based on the naive divide-and-conquer averaging method, and usually performs close to the prediction using the full data; in some cases, it even leads to better results than the full-data prediction.
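The workflow can be sketched as follows: each worker fits a model on its local shard, and the center learns averaging weights for the local predictions. In the sketch below a held-out split serves as a simple stand-in for the paper's unbiased estimator of the squared prediction risk.

```python
# A minimal sketch of distributed prediction by model averaging: K local
# OLS fits are combined with simplex weights chosen to minimize squared
# error on a small held-out set.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(6)
n, p, K = 2000, 10, 5
X = rng.normal(size=(n, p))
beta = rng.normal(size=p)
y = X @ beta + rng.normal(size=n)

# Fit one OLS model per local shard; hold out the last 200 points.
shards = np.array_split(np.arange(n - 200), K)
betas = [np.linalg.lstsq(X[idx], y[idx], rcond=None)[0] for idx in shards]

Xh, yh = X[-200:], y[-200:]
P = np.column_stack([Xh @ b for b in betas])     # local predictions

def risk(w):
    return np.mean((P @ w - yh) ** 2)

res = minimize(risk, np.full(K, 1.0 / K), method="SLSQP",
               bounds=[(0.0, 1.0)] * K,
               constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}])
naive = np.mean((P.mean(axis=1) - yh) ** 2)
print(f"weighted MSE={risk(res.x):.4f} vs naive averaging MSE={naive:.4f}")
```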

Optimal Weighted Random Forests

Dalei Yu

Xi'an Jiaotong University

Abstract: The random forest (RF) algorithm has become a very popular prediction method for its great flexibility and promising accuracy. In RF, it is conventional to put equal weights on all the base learners (trees) when aggregating their predictions. However, the predictive performances of different trees within the forest can be very different due to the randomization of the embedded bootstrap sampling and feature selection. In this paper, we focus on RF for regression and propose two optimal weighting algorithms, namely the 1-Step Optimal Weighted RF (1step-WRFopt) and the 2-Steps Optimal Weighted RF (2steps-WRFopt), which combine the base learners through weights determined by weight choice criteria. Under some regularity conditions, we show that these algorithms are asymptotically optimal in the sense that the resulting squared loss and risk are asymptotically identical to those of the infeasible but best possible weighted RF. Numerical studies conducted on real-world and semi-synthetic data sets indicate that these algorithms outperform the equal-weight forest and two other weighted RFs proposed in the existing literature in most cases.

Joint work with Xinyu Chen, Xinyu Zhang.
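To make the weighting step concrete, the sketch below extracts per-tree predictions from a fitted scikit-learn forest and solves a simplex-constrained least-squares problem for the tree weights on a validation set. This is a simplified stand-in for the 1step-WRFopt/2steps-WRFopt weight choice criteria, which are not reproduced here.

```python
# A minimal sketch of a weighted random forest for regression: collect
# per-tree predictions on a held-out set and optimize the tree weights.
import numpy as np
from scipy.optimize import minimize
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=600, n_features=10, noise=10.0,
                       random_state=7)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3,
                                            random_state=7)

rf = RandomForestRegressor(n_estimators=50, random_state=7).fit(X_tr, y_tr)

# Per-tree predictions on the validation set, one column per tree.
P = np.column_stack([tree.predict(X_val) for tree in rf.estimators_])
M = P.shape[1]

def loss(w):
    return np.mean((P @ w - y_val) ** 2)

res = minimize(loss, np.full(M, 1.0 / M), method="SLSQP",
               bounds=[(0.0, 1.0)] * M,
               constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}])
print(f"equal-weight MSE={loss(np.full(M, 1 / M)):.2f}, "
      f"optimized MSE={loss(res.x):.2f}")
```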

Invited Session IS079: The Interplay Between Statistical Inference and Data-Driven Decision Making

Controlling the False Discovery Rate in Transformations: Split Knockoffs

Yuan Yao

The Hong Kong University of Science and Technology

Abstract: Controlling the False Discovery Rate (FDR) in a variable selection procedure is critical for reproducible discoveries and has received extensive study in sparse linear models. However, the problem remains largely open in scenarios where the sparsity constraint is imposed not directly on the parameters but on a linear transformation of the parameters to be estimated. Examples include total variation, wavelet transforms, the fused LASSO, and trend filtering. In this work, we propose a data-adaptive FDR control for this transformational sparsity setting: the Split Knockoff method. The proposed scheme exploits both variable and data splitting. The linear transformation constraint is relaxed to its Euclidean proximity in a lifted parameter space, yielding an orthogonal design for improved power and orthogonal Split Knockoff copies. To overcome the failure of exchangeability caused by the heterogeneous noise that the transformation introduces, new inverse supermartingale structures are developed for provable FDR control with directional effects. Simulation experiments show that the proposed methodology achieves the desired (directional) FDR and power. An application to an Alzheimer's Disease study shows that atrophied brain regions and their abnormal connections can be discovered from a structural Magnetic Resonance Imaging dataset (ADNI).

Joint work with Yang Cao and Xinwei Sun.
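The selection step that knockoff-style methods, including Split Knockoffs, build on can be stated compactly: given feature statistics W_j that are sign-symmetric under the null (large positive values indicating signals), select the variables above the knockoff+ data-dependent threshold. The sketch below applies that threshold to simulated statistics, not to statistics produced by split knockoffs.

```python
# A minimal sketch of the knockoff+ selection step: find the smallest
# threshold t with (1 + #{W <= -t}) / #{W >= t} <= q and select W >= t.
import numpy as np

rng = np.random.default_rng(8)
q = 0.2                                       # target FDR level
W = np.concatenate([rng.normal(3, 1, 20),     # 20 signals: W tends positive
                    rng.normal(0, 1, 180)])   # 180 nulls: symmetric W

candidates = np.sort(np.abs(W[W != 0]))
T = np.inf
for t in candidates:
    fdp_hat = (1 + np.sum(W <= -t)) / max(np.sum(W >= t), 1)
    if fdp_hat <= q:
        T = t
        break
selected = np.where(W >= T)[0]
print(f"threshold={T:.3f}, selected {len(selected)} variables")
```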

Balancing Personalization and Pooling: Decision-Making and Statistical Inference with Limited Time Horizons

Yongyi Guo

University of Wisconsin-Madison

Abstract: In contrast to traditional clinical trials, digital health interventions facilitate adaptive personalized treatments delivered in near real-time to manage health risks and promote healthy behaviors. Integrating Reinforcement Learning (RL) algorithms into mHealth (mobile health) studies presents numerous challenges, a critical one being the constrained time horizon, which leads to data scarcity and affects decision quality as well as the autonomy and stability of RL algorithms in practical applications. To address this challenge, we propose a solution for online decision-making and post-study statistical inference. Leveraging the mixed-effects reward model in Thompson sampling, we efficiently utilize user data to expedite informed decision-making. The online algorithm makes traditional statistical analysis of the treatment effect invalid: the user histories are not independent even if we assume the potential outcomes are i.i.d., because the RL algorithm makes decisions using pooled user information in addition to the user state variables. We provide valid asymptotic confidence intervals for the average causal excursion effect using the idea of decomposing the policy into "population statistics" and decisions based on "(expanded) user states". As an example, I will also present the MiWaves clinical trial, an AI-based mobile health intervention to reduce cannabis use among emerging adults.

Joint work with Susobhan Ghosh, Pei-Yao Hung, Lara Coughlin, Erin Bonar, Inbal Nahum-Shani, Maureen Walton, Susan Murphy.
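As a toy illustration of pooling within Thompson sampling, the sketch below shrinks each user's posterior arm means toward a population-level mean estimated from all users' data. Gaussian rewards and known variances are simplifying assumptions; the paper's mixed-effects reward model and post-study inference are considerably richer.

```python
# A minimal sketch of Thompson sampling with partial pooling across users:
# each user's posterior per-arm mean is a precision-weighted blend of the
# user's own data and a pooled population-level estimate.
import numpy as np

rng = np.random.default_rng(9)
n_users, horizon, sigma2, tau2 = 20, 30, 1.0, 0.25
true_pop = np.array([0.0, 0.5])                      # population arm means
true_user = true_pop + rng.normal(0, np.sqrt(tau2), size=(n_users, 2))

sums = np.zeros((n_users, 2))
counts = np.zeros((n_users, 2))
for t in range(horizon):
    # Population-level mean per arm, pooled over all users so far.
    pooled = np.where(counts.sum(0) > 0,
                      sums.sum(0) / np.maximum(counts.sum(0), 1), 0.0)
    for i in range(n_users):
        # Precision-weighted posterior: user data shrunk toward pooled mean.
        prec = counts[i] / sigma2 + 1.0 / tau2
        mean = (sums[i] / sigma2 + pooled / tau2) / prec
        draw = rng.normal(mean, 1.0 / np.sqrt(prec))  # Thompson draw
        a = int(np.argmax(draw))
        r = rng.normal(true_user[i, a], np.sqrt(sigma2))
        sums[i, a] += r
        counts[i, a] += 1

# Arm 1 has the higher population-level mean reward.
print("share of arm-1 pulls:", counts[:, 1].sum() / counts.sum())
```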

Conformal Alignment: Knowing When to Trust Foundation Models with Guarantees

Ying Jin

Stanford University

Abstract: Before deploying outputs from foundation models in high-stakes tasks, it is imperative to ensure that they align with human values. For instance, in radiology report generation, reports generated by a vision-language model must align with human evaluations before their use in medical decision-making. We present Conformal Alignment, a general framework for identifying units whose outputs meet a user-specified alignment criterion. It is guaranteed that, on average, a prescribed fraction of selected units indeed meet the alignment criterion, regardless of the foundation model or the data distribution. Given any pre-trained model and new units with model-generated outputs, Conformal Alignment leverages a set of reference data with ground-truth alignment status to train an alignment predictor. It then selects new units whose predicted alignment scores surpass a data-dependent threshold, certifying their corresponding outputs as trustworthy. Through applications to question answering and radiology report generation, we demonstrate that our method is able to accurately identify units with trustworthy outputs via lightweight training over a moderate amount of reference data. En route, we investigate the informativeness of various features in alignment prediction and combine them with standard models to construct the alignment predictor.

Joint work with Yu Gui, Zhimei Ren.
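The workflow can be sketched in a few lines: train an alignment predictor on labeled reference data, calibrate a score threshold on a held-out split, and certify test outputs above it. The calibration rule below (empirical precision of at least 1 - alpha on the calibration split) conveys the idea but is not the paper's exact finite-sample guarantee.

```python
# A minimal sketch of the conformal-alignment workflow on simulated data:
# fit an alignment-score predictor, pick a threshold on calibration data,
# and certify test outputs whose predicted scores exceed it.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(10)
n, alpha = 900, 0.1
feats = rng.normal(size=(n, 3))                 # e.g. model-confidence features
aligned = (feats[:, 0] + 0.5 * rng.normal(size=n) > 0).astype(int)

train, calib, test = slice(0, 300), slice(300, 600), slice(600, None)
clf = LogisticRegression().fit(feats[train], aligned[train])
s_cal = clf.predict_proba(feats[calib])[:, 1]   # predicted alignment scores
s_test = clf.predict_proba(feats[test])[:, 1]

# Smallest threshold whose calibration precision reaches 1 - alpha.
threshold = 1.0
for t in np.sort(s_cal):
    keep = s_cal >= t
    if keep.any() and aligned[calib][keep].mean() >= 1 - alpha:
        threshold = t
        break
selected = np.where(s_test >= threshold)[0]
print(f"threshold={threshold:.3f}, certified {len(selected)} test outputs")
```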

Large Covariance Matrix Estimation with Factor-Assisted Variable Clustering

Cheng Yu

Tsinghua University

Abstract: In the field of large covariance matrix estimation, several methods have been developed based on factor models, assuming the existence of a few common factors that can explain the co-movement of asset prices. However, many studies have demonstrated the presence of cross-sectional correlation between assets after removing the common factors. To account for this effect, we propose an approximate observable factor model with a latent cluster structure, along with a three-step estimator to accurately estimate the large covariance matrix for high-dimensional time series. The rates of convergence of the residual covariance with latent cluster structure and of the whole large covariance matrix are studied under various norms. Additionally, we introduce a novel ratio-based criterion for determining the latent cluster structure, which achieves clustering consistency with probability approaching one. The asymptotic results are supported by simulation studies, and we demonstrate the practical application of our approach through a real data analysis on minimum-variance portfolio allocation.

Joint work with Dong Li, Xinghao Qiao.
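A simplified rendering of the three-step idea: factor regressions, clustering of residual correlations, and a block-diagonal residual covariance recombined with the factor part. The cluster number is taken as known below, whereas the paper selects it with a ratio-based criterion; the data and clustering algorithm are illustrative assumptions.

```python
# A minimal sketch of a factor-plus-clustering covariance estimator:
# (1) regress each series on observed factors, (2) cluster variables by
# their residual correlation profiles, (3) keep only within-cluster
# residual covariance (block diagonal) and add back the factor part.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(11)
T, N, K, G = 500, 30, 3, 3
F = rng.normal(size=(T, K))                      # observed factors
B = rng.normal(size=(N, K))                      # loadings
labels_true = np.repeat(np.arange(G), N // G)
E = rng.normal(size=(T, N))
E += 0.8 * np.column_stack([rng.normal(size=T)   # cluster-level shocks
                            for _ in range(G)])[:, labels_true]
Y = F @ B.T + E

# Step 1: factor regressions and residuals.
Bhat = np.linalg.lstsq(F, Y, rcond=None)[0].T    # (N, K) estimated loadings
U = Y - F @ Bhat.T

# Step 2: cluster variables by residual correlation profiles.
corr = np.corrcoef(U, rowvar=False)
labels = KMeans(n_clusters=G, n_init=10, random_state=0).fit_predict(corr)

# Step 3: block-diagonal residual covariance plus the factor part.
Su = np.zeros((N, N))
full = np.cov(U, rowvar=False)
for g in range(G):
    idx = np.where(labels == g)[0]
    Su[np.ix_(idx, idx)] = full[np.ix_(idx, idx)]
Sigma = Bhat @ np.cov(F, rowvar=False) @ Bhat.T + Su
print("estimated covariance shape:", Sigma.shape)
```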

Enhancing Decision Making with Causal Inference
