## HulC: Pros

.notesmall[Guaranteed coverage] of at least `\(1 - \alpha\)` for a user-specified `\(\alpha\)`. <br> .small[(number of sub-samples grows fast, though, as level increases)]

.notesmall[Directly uses the mapping from data to parameter estimates] (estimation procedure), instead of statements about the distribution of that mapping or assumptions about rates of convergence.

.notesmall[Typically, fewer regularity conditions] than intervals based on the inversion of asymptotic pivots, or bootstrap/sub-sampling.

- Assumes that data are realizations of independent random variables.
- Requires an upper bound on the estimator's median bias.

.notesmall[Simple to implement]. .small[(under the independence assumption)]

.notesmall[Better coverage properties through procedures for improving estimator performance], such as median bias-reducing adjusted score functions ([Kenne Pagui, Salvan, and Sartori, 2017](#bib-kennesalvansartori2017); [Kosmidis, Kenne Pagui, and Sartori, 2020](#bib-kosmidiskennepaguisartori2020)).

.notesmall[Equivariant to monotone transformations]

---

## Modelling settings

Modelling settings and estimation methods

- Models for stratified data (many nuisance parameters)
  e.g. [Sartori (2003)](#bib-sartori2003), [Bellio, Kosmidis, Salvan et al. (2023)](#bib-bellioetal2023)
- Partially-specified models
  e.g. quasi-likelihoods, GEEs, composite likelihoods ([Varin, Reid, and Firth, 2011](#bib-varin2011))
- Doubly-robust estimation of causal effects
- Regularized estimation with tuning parameter selection
  e.g. ridge/lasso regression with `\(p > n\)`?
- Online estimation / Online HulC intervals?
  e.g.
explicit/implicit SGD and variants ([Toulis and Airoldi, 2017](#bib-toulisairoldi2017))

Dependent data

- Data exhibiting spatial/temporal dependence

Ideas in [Carlstein (1986)](#bib-carlstein1986) and [Heagerty and Lumley (2000)](#bib-heagertylumley2000) can be relevant

---

## HulC: A concern

HulC intervals:

- are typically slightly wider than intervals from the inversion of asymptotic pivots <br> a small price to pay for coverage guarantees under fewer assumptions;
- But: depend on (the RNG seed used for obtaining) the sub-samples.

As with other randomized confidence intervals, different random partitions of the data yield different intervals <br> care is needed in their use for effect discovery (e.g. designed experiments, clinical trials, observational studies, ATE, etc.)

---

## HulC and reproducibility robustness

<img src="HulC_discussion_files/figure-html/unnamed-chunk-3-1.png" width="120%" />

---

## HulC and reproducibility robustness

Progress can potentially be made with HulC, because it depends directly on order statistics from independent random variables

`\((\min_{1 \le j \le B} \hat\theta_j, \max_{1 \le j \le B} \hat\theta_j)\)`

Is it possible to reduce the variability of the endpoints due to sub-sampling? e.g.

- aggregation of HulC intervals,
- use of properties of the distribution of order statistics to inform the choice of `\(B\)`,
- does controlling the variance of the estimator help?
- ...

---

## Further points

Are there any explicit links between the variance properties of the estimator and HulC's performance?
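
---

## HulC: A minimal sketch

To make the dependence of the endpoints on the random partition concrete, here is a rough Python sketch of a HulC-type interval. It is not the reference implementation. It assumes independent observations and a known upper bound `delta` on the estimator's median bias. It also uses the simple non-randomized choice of `\(B\)`, the smallest `\(B\)` with `\((1/2 - \Delta)^B + (1/2 + \Delta)^B \le \alpha\)`, which is conservative. The names `hulc_num_splits`, `hulc_interval` and `demo` are hypothetical.

```python
import random
import statistics
from typing import Callable, Sequence, Tuple


def hulc_num_splits(alpha: float, delta: float = 0.0) -> int:
    """Smallest B with (1/2 - delta)^B + (1/2 + delta)^B <= alpha.

    delta is an assumed upper bound on the estimator's median bias.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    if not 0.0 <= delta < 0.5:
        raise ValueError("delta must lie in [0, 0.5)")
    b = 1
    while (0.5 - delta) ** b + (0.5 + delta) ** b > alpha:
        b += 1
    return b


def hulc_interval(
    data: Sequence[float],
    estimator: Callable[[Sequence[float]], float],
    alpha: float = 0.05,
    delta: float = 0.0,
    seed: int = 0,
) -> Tuple[float, float]:
    """Min and max of the estimates over a random partition into B parts."""
    b = hulc_num_splits(alpha, delta)
    if len(data) < b:
        raise ValueError("need at least one observation per sub-sample")
    rng = random.Random(seed)      # the partition depends on this seed
    shuffled = list(data)
    rng.shuffle(shuffled)
    # Disjoint sub-samples: every b-th observation of the shuffled data.
    estimates = [estimator(shuffled[j::b]) for j in range(b)]
    return min(estimates), max(estimates)


def demo() -> None:
    # Simulated Gaussian data: the sample mean is symmetric about the
    # true mean, so delta = 0 is a reasonable assumption here.
    data_rng = random.Random(1)
    sample = [data_rng.gauss(0.0, 1.0) for _ in range(600)]
    for seed in range(3):
        lo, hi = hulc_interval(sample, statistics.fmean, alpha=0.05, seed=seed)
        print(f"partition seed {seed}: [{lo:.3f}, {hi:.3f}]")


if __name__ == "__main__":
    demo()
```

Running `demo()` on the same simulated sample with different partition seeds gives different endpoints, which is the reproducibility concern raised above. Fixing and reporting the seed makes a single analysis repeatable, but does not remove the underlying variability.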