class: center, middle, inverse, title-slide

.title[
# The HulC: Confidence Regions from convex hulls
]
.subtitle[
## Discussant contribution
]
.author[
### <div class="line-block">Ioannis Kosmidis<br />
Professor of Statistics</div>
]
.institute[
### <div class="line-block">University of Warwick<br />
ikosmidis.com | GitHub: ikosmidis | Twitter: ikosmidis_</div>
]
.date[
### 31 May 2023
]

---

<style>
.notesmall {
	color: #478EC1;
}

</style>

## HulC: Pros

.notesmall[Guaranteed coverage] of at least `\(1 - \alpha\)` for a user-specified `\(\alpha\)`.
<br>
.small[(the number of sub-samples grows fast, though, as the confidence level increases)]
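
As a rough illustration of how the number of sub-samples depends on the level, the sketch below finds the smallest number of splits for which the usual HulC miscoverage bound falls below `\(\alpha\)`. The names `miscoverage_bound`, `n_splits` and `delta` (an assumed upper bound on the median bias) are hypothetical, and the randomization over the number of splits that HulC uses to attain the level exactly is omitted.

```python
def miscoverage_bound(B: int, delta: float) -> float:
    """Bound on P(parameter outside the hull of B independent estimates),
    assuming each estimate has median bias at most delta (0 <= delta < 1/2)."""
    return (0.5 - delta) ** B + (0.5 + delta) ** B


def n_splits(alpha: float, delta: float = 0.0) -> int:
    """Smallest B >= 1 whose miscoverage bound does not exceed alpha."""
    if not (0.0 < alpha < 1.0 and 0.0 <= delta < 0.5):
        raise ValueError("need 0 < alpha < 1 and 0 <= delta < 1/2")
    B = 1
    while miscoverage_bound(B, delta) > alpha:
        B += 1
    return B


# Number of splits for a few levels, without and with some median bias.
for alpha in (0.1, 0.05, 0.01, 0.001):
    print(alpha, n_splits(alpha), n_splits(alpha, delta=0.1))
```

Each split uses roughly `n / B` observations, so a larger `B` means each estimate is computed from a smaller sub-sample.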

.notesmall[Directly uses the mapping from data to parameter estimates]
(estimation procedure), instead of statements about the distribution
of that mapping or assumptions about rates of convergence.

.notesmall[Typically, fewer regularity conditions] than intervals based on the
inversion of asymptotic pivots, or bootstrap/sub-sampling.
  
- Assumes that data are realizations of independent random variables.
- Requires an upper bound on the estimator's median bias.

.notesmall[Simple to implement]. .small[(under the independence assumption)]

.notesmall[Better coverage properties through procedures for improving estimator performance], such as median bias-reducing adjusted score functions ([Kenne Pagui, Salvan, and Sartori, 2017](#bib-kennesalvansartori2017); [Kosmidis, Kenne Pagui, and Sartori, 2020](#bib-kosmidiskennepaguisartori2020)).

.notesmall[Equivariant to monotone transformations]
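
A minimal sketch of the equivariance property, assuming the sample median as the estimator, six splits, and a fixed partition (same seed) for both calls. The function `hulc_interval` is a hypothetical helper written for this illustration, not an existing implementation.

```python
import math
import random
import statistics


def hulc_interval(data, estimator, B, seed=0):
    """Hull (min, max) of the estimates from B random, disjoint sub-samples."""
    rng = random.Random(seed)
    idx = list(range(len(data)))
    rng.shuffle(idx)  # random partition of the observation indices
    estimates = [estimator([data[i] for i in idx[j::B]]) for j in range(B)]
    return min(estimates), max(estimates)


data_rng = random.Random(1)
data = [data_rng.gauss(0.0, 1.0) for _ in range(600)]

# Interval for theta, and interval for exp(theta) on the same partition.
lo, hi = hulc_interval(data, statistics.median, B=6, seed=42)
lo_t, hi_t = hulc_interval(
    data, lambda x: math.exp(statistics.median(x)), B=6, seed=42
)

# For an increasing transformation the endpoints simply map across.
print((math.exp(lo), math.exp(hi)), (lo_t, hi_t))
```

For a decreasing transformation the endpoints swap roles, and the property relies on both intervals being computed from the same partition.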

---

## Modelling settings

Modelling settings and estimation methods

- Models for stratified data (many nuisances)

e.g. [Sartori (2003)](#bib-sartori2003) and [Bellio, Kosmidis, Salvan et al. (2023)](#bib-bellioetal2023)

- Partially-specified models

e.g. quasi-likelihoods, GEEs, composite likelihoods ([Varin, Reid, and Firth, 2011](#bib-varin2011))

- Doubly-robust estimation of causal effects

- Regularized estimation with tuning parameter selection

e.g. ridge/lasso regression with `\(p > n\)`?

- Online estimation / Online HulC intervals?

e.g. explicit/implicit SGD and variants ([Toulis and Airoldi, 2017](#bib-toulisairoldi2017))

Dependent data

- Data exhibiting spatial/temporal dependence

Ideas in [Carlstein (1986)](#bib-carlstein1986) and [Heagerty and Lumley (2000)](#bib-heagertylumley2000) can be relevant

---

## HulC: A concern

HulC intervals:

- are typically slightly wider than intervals from the inversion of asymptotic pivots
  <br>
  (a small price to pay for coverage guarantees under fewer
  assumptions);

- but depend on (the RNG seed used for obtaining) the sub-samples.

As with other randomized confidence intervals, different random
partitions of the data yield different intervals,
<br>
so care is needed when using them for effect discovery (e.g. designed
experiments, clinical trials, observational studies, ATE estimation).
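
The dependence on the partition can be seen in a small simulation. The sketch below is only illustrative: it assumes simulated Gaussian data with a small positive mean, the sample mean as the estimator, six splits, and a hypothetical helper `hulc_interval`; none of these choices come from the paper under discussion.

```python
import random
import statistics


def hulc_interval(data, estimator, B, seed):
    """Hull (min, max) of the estimates from B random, disjoint sub-samples."""
    rng = random.Random(seed)
    idx = list(range(len(data)))
    rng.shuffle(idx)  # the seed determines the partition
    estimates = [estimator([data[i] for i in idx[j::B]]) for j in range(B)]
    return min(estimates), max(estimates)


# One fixed data set; only the seed used for the partition changes below.
data_rng = random.Random(2023)
data = [data_rng.gauss(0.1, 1.0) for _ in range(600)]

intervals = [hulc_interval(data, statistics.fmean, B=6, seed=s) for s in range(20)]
n_excluding_zero = sum(lo > 0 or hi < 0 for lo, hi in intervals)

for s, (lo, hi) in enumerate(intervals[:5]):
    print(f"seed={s}: ({lo:.3f}, {hi:.3f})")
print(f"{n_excluding_zero} of {len(intervals)} intervals exclude zero")
```

The data are identical across runs, so any disagreement about whether zero is covered is due to the partition alone.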

---

## HulC and reproducibility robustness


Progress can potentially be made with HulC, because it depends
directly on order statistics of independent random variables:

`\((\min_{1 \le j \le B} \hat\theta_j, \max_{1 \le j \le B} \hat\theta_j)\)`
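
For reference, a sketch of why the hull of independent estimates yields a coverage guarantee. The symbols `\(\theta_0\)` (the true parameter value) and `\(\Delta\)` (an upper bound on the median bias of each `\(\hat\theta_j\)`) are introduced here as assumptions, and the estimators are taken to be independent and continuously distributed, so that ties with `\(\theta_0\)` have probability zero.

```latex
\begin{aligned}
P\Big\{\theta_0 \notin \big(\min_{1 \le j \le B} \hat\theta_j,\ \max_{1 \le j \le B} \hat\theta_j\big)\Big\}
  &= P(\hat\theta_j > \theta_0 \text{ for all } j) + P(\hat\theta_j < \theta_0 \text{ for all } j) \\
  &= \prod_{j=1}^{B} p_j + \prod_{j=1}^{B} (1 - p_j),
     \qquad p_j = P(\hat\theta_j > \theta_0), \\
  &\le \Big(\tfrac{1}{2} + \Delta\Big)^{B} + \Big(\tfrac{1}{2} - \Delta\Big)^{B},
     \qquad \text{when } |p_j - \tfrac{1}{2}| \le \Delta \text{ for all } j .
\end{aligned}
```

With `\(\Delta = 0\)` the bound is `\(2^{1 - B}\)`, so miscoverage of at most `\(\alpha\)` needs `\(B \ge \log_2(2 / \alpha)\)`.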

Is it possible to reduce the variability of the endpoints due to sub-sampling?

e.g. by aggregating HulC intervals, by using properties of the distribution of order statistics to inform
the choice of `\(B\)`, or by controlling the variance of the estimator, ...

---

## Further points

Are there any explicit links between the variance properties of the
estimator and HulC's performance?

Pointwise coverage guarantees `\(\longrightarrow\)` Simultaneous coverage guarantees 
<br>
controlling `\(\alpha\)` or union bound arguments can work well in parameter spaces
of fixed dimension, but perhaps not more generally?
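
A back-of-the-envelope sketch of the union bound argument, assuming median-unbiased estimators (miscoverage bound `\(2^{1 - B}\)` per coordinate) and a Bonferroni split of `\(\alpha\)` over `d` coordinates. The function name `splits_for_simultaneous` and the sample size `n = 1000` are hypothetical choices made for illustration.

```python
def splits_for_simultaneous(alpha: float, d: int) -> int:
    """Smallest B with 2^(1 - B) <= alpha / d, i.e. per-coordinate level
    alpha / d under zero median bias (Bonferroni correction over d)."""
    per_coordinate = alpha / d
    B = 1
    while 2.0 ** (1 - B) > per_coordinate:
        B += 1
    return B


n = 1000  # hypothetical total sample size
for d in (1, 10, 100, 1000):
    B = splits_for_simultaneous(0.05, d)
    print(f"d={d:5d}  B={B:2d}  observations per split ~ {n // B}")
```

The number of splits grows only logarithmically in `d`, but each split is left with about `n / B` observations, which may become restrictive when `d` grows with `n`.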

How does the performance of HulC procedures deteriorate in
misspecified models (potentially persistent median bias)?

Is there a price to pay for using the sample twice in adaptive HulC (once
for median bias estimation and once for computing the HulC interval)?

---

## References 
.small[
<a name=bib-bellioetal2023></a>[Bellio, R., I. Kosmidis, A. Salvan, et
al.](#cite-bellioetal2023) (2023). "Parametric bootstrap inference for stratified models with
high-dimensional nuisance specifications". In: _Statistica Sinica_ 33.

<a name=bib-carlstein1986></a>[Carlstein, E.](#cite-carlstein1986) (1986). "The use of subseries
values for estimating the variance of a general statistic from a stationary sequence". In: _The
Annals of Statistics_ 14.3, pp. 1171-1179.

<a name=bib-heagertylumley2000></a>[Heagerty, P. J. and T. Lumley](#cite-heagertylumley2000)
(2000). "Window Subsampling of Estimating Functions with Application to Regression Models".
In: _Journal of the American Statistical Association_ 95.449, pp. 197-211.

<a name=bib-kennesalvansartori2017></a>[Kenne Pagui, E. C., A. Salvan, and N.
Sartori](#cite-kennesalvansartori2017) (2017). "Median bias reduction of maximum likelihood
estimates". In: _Biometrika_ 104.4, pp. 923-938. DOI:
[10.1093/biomet/asx046](https://doi.org/10.1093%2Fbiomet%2Fasx046).

<a name=bib-kosmidiskennepaguisartori2020></a>[Kosmidis, I., E. C. Kenne Pagui, and N.
Sartori](#cite-kosmidiskennepaguisartori2020) (2020). "Mean and median bias reduction in
generalized linear models". In: _Statistics and Computing_ 30, pp. 43-59.

<a name=bib-sartori2003></a>[Sartori, N.](#cite-sartori2003) (2003). "Modified profile
likelihoods in models with stratum nuisance parameters". In: _Biometrika_ 90.3, pp. 533-549. DOI:
[10.1093/biomet/90.3.533](https://doi.org/10.1093%2Fbiomet%2F90.3.533).

<a name=bib-toulisairoldi2017></a>[Toulis, P. and E. M. Airoldi](#cite-toulisairoldi2017)
(2017). "Asymptotic and finite-sample properties of estimators based on stochastic gradients".
In: _Annals of Statistics_ 45.4, pp. 1694-1727.

<a name=bib-varin2011></a>[Varin, C., N. Reid, and D. Firth](#cite-varin2011) (2011). "An
overview of composite likelihood methods". In: _Statistica Sinica_ 21.1, pp. 5-42.
]