## Bibliography

[2] Atkinson, A. C., and A. N. Donev. *Optimum Experimental Designs*. New York: Oxford University Press, 1992.

[3] Bates, D. M., and D. G. Watts. *Nonlinear Regression Analysis and Its Applications*. Hoboken, NJ: John Wiley & Sons, Inc., 1988.

[4] Belsley, D. A., E. Kuh, and R. E. Welsch. *Regression Diagnostics*. Hoboken, NJ: John Wiley & Sons, Inc., 1980.

[5] Berry, M. W., et al. “Algorithms and Applications for Approximate Nonnegative Matrix Factorization.”
*Computational Statistics and Data Analysis*. Vol. 52, No. 1, 2007, pp. 155–173.

[6] Bookstein, Fred L. *Morphometric Tools for Landmark Data*. Cambridge, UK: Cambridge University Press, 1991.

[7] Bouye, E., V. Durrleman, A. Nikeghbali, G. Riboulet, and T. Roncalli. “Copulas for Finance: A Reading Guide and Some Applications.” Working Paper. Groupe de Recherche Operationnelle, Credit Lyonnais, 2000.

[8] Bowman, A. W., and A. Azzalini. *Applied Smoothing Techniques for Data Analysis*. New York: Oxford University Press, 1997.

[9] Box, G. E. P., and N. R. Draper. *Empirical Model-Building and Response Surfaces*. Hoboken, NJ: John Wiley & Sons, Inc., 1987.

[10] Box, G. E. P., W. G. Hunter, and J. S. Hunter. *Statistics for Experimenters*. Hoboken, NJ: Wiley-Interscience, 1978.

[11] Bratley, P., and B. L. Fox. “ALGORITHM 659 Implementing Sobol's Quasirandom Sequence Generator.”
*ACM Transactions on Mathematical Software*. Vol. 14, No. 1, 1988, pp. 88–100.

[12] Breiman, L. “Random Forests.”
*Machine Learning*. Vol. 45, No. 1, 2001, pp. 5–32.

[13] Breiman, L., J. Friedman, R. Olshen, and C. Stone. *Classification and Regression Trees*. Boca Raton, FL: CRC Press, 1984.

[14] Bulmer, M. G. *Principles of Statistics*. Mineola, NY: Dover Publications, Inc., 1979.

[15] Bury, K. *Statistical Distributions in Engineering*. Cambridge, UK: Cambridge University Press, 1999.

[16] Chatterjee, S., and A. S. Hadi. “Influential Observations, High Leverage Points, and Outliers in Linear Regression.”
*Statistical Science*. Vol. 1, 1986, pp. 379–416.

[17] Collett, D. *Modeling Binary Data*. New York: Chapman & Hall, 2002.

[18] Conover, W. J. *Practical Nonparametric Statistics*. Hoboken, NJ: John Wiley & Sons, Inc., 1980.

[19] Cook, R. D., and S. Weisberg. *Residuals and Influence in Regression*. New York: Chapman & Hall/CRC Press, 1983.

[20] Cox, D. R., and D. Oakes. *Analysis of Survival Data*. London: Chapman & Hall, 1984.

[21] Davidian, M., and D. M. Giltinan. *Nonlinear Models for Repeated Measurements Data*. New York: Chapman & Hall, 1995.

[22] Deb, P., and M. Sefton. “The Distribution of a Lagrange Multiplier Test of Normality.”
*Economics Letters*. Vol. 51, 1996, pp. 123–130.

[23] de Jong, S. “SIMPLS: An Alternative Approach to Partial Least Squares Regression.”
*Chemometrics and Intelligent Laboratory Systems*. Vol. 18, 1993, pp. 251–263.

[24] Demidenko, E. *Mixed Models: Theory and Applications*. Hoboken, NJ: John Wiley & Sons, Inc., 2004.

[25] Delyon, B., M. Lavielle, and E. Moulines. “Convergence of a Stochastic Approximation Version of the EM Algorithm.”
*Annals of Statistics*. Vol. 27, 1999, pp. 94–128.

[26] Dempster, A. P., N. M. Laird, and D. B. Rubin. “Maximum Likelihood from Incomplete Data via the EM Algorithm.”
*Journal of the Royal Statistical Society*. Series B, Vol. 39, No. 1, 1977, pp. 1–37.

[27] Devroye, L. *Non-Uniform Random Variate Generation*. New York: Springer-Verlag, 1986.

[28] Dobson, A. J. *An Introduction to Generalized Linear Models*. New York: Chapman & Hall, 1990.

[29] Dunn, O. J., and V. A. Clark. *Applied Statistics: Analysis of Variance and Regression*. New York: Wiley, 1974.

[30] Draper, N. R., and H. Smith. *Applied Regression Analysis*. Hoboken, NJ: Wiley-Interscience, 1998.

[31] Drezner, Z. “Computation of the Trivariate Normal Integral.”
*Mathematics of Computation*. Vol. 63, 1994, pp. 289–294.

[32] Drezner, Z., and G. O. Wesolowsky. “On the Computation of the Bivariate Normal Integral.”
*Journal of Statistical Computation and Simulation*. Vol. 35, 1989, pp. 101–107.

[33] DuMouchel, W. H., and F. L. O'Brien. “Integrating a Robust Option into a Multiple Regression Computing Environment.”
*Computer Science and Statistics: Proceedings of the 21st Symposium on the Interface*. Alexandria, VA: American Statistical Association, 1989.

[34] Durbin, R., S. Eddy, A. Krogh, and G. Mitchison. *Biological Sequence Analysis*. Cambridge, UK: Cambridge University Press, 1998.

[35] Efron, B., and R. J. Tibshirani. *An Introduction to the Bootstrap*. New York: Chapman & Hall, 1993.

[36] Embrechts, P., C. Klüppelberg, and T. Mikosch. *Modelling Extremal Events for Insurance and Finance*. New York: Springer, 1997.

[37] Evans, M., N. Hastings, and B. Peacock. *Statistical Distributions*. 2nd ed., Hoboken, NJ: John Wiley & Sons, Inc., 1993, pp. 50–52, 73–74, 102–105, 147, 148.

[38] Friedman, J. H. “Greedy function approximation: a gradient boosting machine.”
*The Annals of Statistics*. Vol. 29, No. 5, 2001, pp. 1189–1232.

[39] Genz, A. “Numerical Computation of Rectangular Bivariate and Trivariate Normal and t Probabilities.”
*Statistics and Computing*. Vol. 14, No. 3, 2004, pp. 251–260.

[40] Genz, A., and F. Bretz. “Comparison of Methods for the Computation of Multivariate t Probabilities.”
*Journal of Computational and Graphical Statistics*. Vol. 11, No. 4, 2002, pp. 950–971.

[41] Genz, A., and F. Bretz. “Numerical Computation of Multivariate t Probabilities with Application to Power Calculation of Multiple Contrasts.”
*Journal of Statistical Computation and Simulation*. Vol. 63, 1999, pp. 361–378.

[42] Gibbons, J. D. *Nonparametric Statistical Inference*. New York: Marcel Dekker, 1985.

[43] Goldstein, A., A. Kapelner, J. Bleich, and E. Pitkin. “Peeking inside the black box: Visualizing statistical learning with plots of individual conditional expectation.”
*Journal of Computational and Graphical Statistics*. Vol. 24, No. 1, 2015, pp. 44–65.

[44] Goodall, C. R. “Computation Using the QR Decomposition.”
*Handbook in Statistics*. Vol. 9, Amsterdam: Elsevier/North-Holland, 1993.

[45] Goodnight, J. H., and F. M. Speed. *Computing Expected Mean Squares*. Cary, NC: SAS Institute, 1978.

[46] Hahn, Gerald J., and S. S. Shapiro. *Statistical Models in Engineering*. Hoboken, NJ: John Wiley & Sons, Inc., 1994, p. 95.

[47] Hald, A. *Statistical Theory with Engineering Applications*. Hoboken, NJ: John Wiley & Sons, Inc., 1960.

[48] Harman, H. H. *Modern Factor Analysis*. 3rd ed. Chicago: University of Chicago Press, 1976.

[49] Hastie, T., R. Tibshirani, and J. H. Friedman. *The Elements of Statistical Learning*. New York: Springer, 2001.

[50] Hill, P. D. “Kernel estimation of a distribution function.”
*Communications in Statistics – Theory and Methods*. Vol. 14, Issue 3, 1985, pp. 605–620.

[51] Hochberg, Y., and A. C. Tamhane. *Multiple Comparison Procedures*. Hoboken, NJ: John Wiley & Sons, 1987.

[52] Hoerl, A. E., and R. W. Kennard. “Ridge Regression: Applications to Nonorthogonal Problems.”
*Technometrics*. Vol. 12, No. 1, 1970, pp. 69–82.

[53] Hoerl, A. E., and R. W. Kennard. “Ridge Regression: Biased Estimation for Nonorthogonal Problems.”
*Technometrics*. Vol. 12, No. 1, 1970, pp. 55–67.

[54] Hogg, R. V., and J. Ledolter. *Engineering Statistics*. New York: MacMillan, 1987.

[55] Holland, P. W., and R. E. Welsch. “Robust Regression Using Iteratively Reweighted Least-Squares.”
*Communications in Statistics: Theory and Methods*. Vol. A6, 1977, pp. 813–827.

[56] Hollander, M., and D. A. Wolfe. *Nonparametric Statistical Methods*. Hoboken, NJ: John Wiley & Sons, Inc., 1999.

[57] Hong, H. S., and F. J. Hickernell. “ALGORITHM 823: Implementing Scrambled Digital Sequences.”
*ACM Transactions on Mathematical Software*. Vol. 29, No. 2, 2003, pp. 95–109.

[58] Huang, P. S., H. Avron, T. N. Sainath, V. Sindhwani, and B. Ramabhadran. “Kernel methods match Deep Neural Networks on TIMIT.”
*2014 IEEE International Conference on Acoustics, Speech and Signal Processing*. 2014, pp. 205–209.

[59] Huber, P. J. *Robust Statistics*. Hoboken, NJ: John Wiley & Sons, Inc., 1981.

[60] Jackson, J. E. *A User's Guide to Principal Components*. Hoboken, NJ: John Wiley and Sons, 1991.

[61] Jain, A., and R. Dubes. *Algorithms for Clustering Data*. Upper Saddle River, NJ: Prentice-Hall, 1988.

[62] Jarque, C. M., and A. K. Bera. “A test for normality of observations and regression residuals.”
*International Statistical Review*. Vol. 55, No. 2, 1987, pp. 163–172.

[63] Joe, S., and F. Y. Kuo. “Remark on Algorithm 659: Implementing Sobol's Quasirandom Sequence Generator.”
*ACM Transactions on Mathematical Software*. Vol. 29, No. 1, 2003, pp. 49–57.

[64] Johnson, N., and S. Kotz. *Distributions in Statistics: Continuous Univariate Distributions-2*. Hoboken, NJ: John Wiley & Sons, Inc., 1970, pp. 130–148, 189–200, 201–219.

[65] Johnson, N. L., N. Balakrishnan, and S. Kotz. *Continuous Multivariate Distributions*. Vol. 1. Hoboken, NJ: Wiley-Interscience, 2000.

[66] Johnson, N. L., S. Kotz, and N. Balakrishnan. *Continuous Univariate Distributions*. Vol. 1, Hoboken, NJ: Wiley-Interscience, 1993.

[67] Johnson, N. L., S. Kotz, and N. Balakrishnan. *Continuous Univariate Distributions*. Vol. 2, Hoboken, NJ: Wiley-Interscience, 1994.

[68] Johnson, N. L., S. Kotz, and N. Balakrishnan. *Discrete Multivariate Distributions*. Hoboken, NJ: Wiley-Interscience, 1997.

[69] Johnson, N. L., S. Kotz, and A. W. Kemp. *Univariate Discrete Distributions*. Hoboken, NJ: Wiley-Interscience, 1993.

[70] Jolliffe, I. T. *Principal Component Analysis*. 2nd ed., New York: Springer-Verlag, 2002.

[71] Jones, M. C. “Simple boundary correction for kernel density estimation.” *Statistics and Computing*. Vol. 3, Issue 3, 1993, pp. 135–146.

[72] Jöreskog, K. G. “Some Contributions to Maximum Likelihood Factor Analysis.” *Psychometrika*. Vol. 32, 1967, pp. 443–482.

[73] Kaufman, L., and P. J. Rousseeuw. *Finding Groups in Data: An Introduction to Cluster Analysis*. Hoboken, NJ: John Wiley & Sons, Inc., 1990.

[74] Kempka, Michał, Wojciech Kotłowski, and Manfred K. Warmuth. “Adaptive Scale-Invariant Online Algorithms for Learning Linear Models.” Preprint, submitted February 10, 2019. https://arxiv.org/abs/1902.07528.

[75] Kendall, David G. “A Survey of the Statistical Theory of Shape.” *Statistical Science*. Vol. 4, No. 2, 1989, pp. 87–99.

[76] Klein, J. P., and M. L. Moeschberger. *Survival Analysis*. Statistics for Biology and Health. 2nd ed. Springer, 2003.

[77] Kleinbaum, D. G., and M. Klein. *Survival Analysis*. Statistics for Biology and Health. 2nd ed. Springer, 2005.

[78] Kocis, L., and W. J. Whiten. “Computational Investigations of Low-Discrepancy Sequences.”
*ACM Transactions on Mathematical Software*. Vol. 23, No. 2, 1997, pp. 266–294.

[79] Kotz, S., and S. Nadarajah. *Extreme Value Distributions: Theory and Applications*. London: Imperial College Press, 2000.

[80] Krzanowski, W. J. *Principles of Multivariate Analysis: A User's Perspective*. New York: Oxford University Press, 1988.

[81] Lawless, J. F. *Statistical Models and Methods for Lifetime Data*. Hoboken, NJ: Wiley-Interscience, 2002.

[82] Lawley, D. N., and A. E. Maxwell. *Factor Analysis as a Statistical Method*. 2nd ed. New York: American Elsevier Publishing, 1971.

[83] Le, Q., T. Sarlós, and A. Smola. “Fastfood — Approximating Kernel Expansions in Loglinear Time.”
*Proceedings of the 30th International Conference on Machine Learning*. Vol. 28, No. 3, 2013, pp. 244–252.

[84] Lilliefors, H. W. “On the Kolmogorov-Smirnov test for normality with mean and variance unknown.”
*Journal of the American Statistical Association*. Vol. 62, 1967, pp. 399–402.

[85] Lilliefors, H. W. “On the Kolmogorov-Smirnov test for the exponential distribution with mean unknown.”
*Journal of the American Statistical Association*. Vol. 64, 1969, pp. 387–389.

[86] Lindstrom, M. J., and D. M. Bates. “Nonlinear mixed-effects models for repeated measures data.”
*Biometrics*. Vol. 46, 1990, pp. 673–687.

[87] Liu, F. T., K. M. Ting, and Z. Zhou. “Isolation Forest.”
*2008 Eighth IEEE International Conference on Data Mining*. Pisa, Italy, 2008, pp. 413–422.

[88] Little, Roderick J. A., and Donald B. Rubin. *Statistical Analysis with Missing Data*. 2nd ed., Hoboken, NJ: John Wiley & Sons, Inc., 2002.

[91] Mardia, K. V., J. T. Kent, and J. M. Bibby. *Multivariate Analysis*. Burlington, MA: Academic Press, 1980.

[92] Marquardt, D. W. “Generalized Inverses, Ridge Regression, Biased Linear Estimation, and Nonlinear Estimation.”
*Technometrics*. Vol. 12, No. 3, 1970, pp. 591–612.

[93] Marquardt, D. W., and R. D. Snee. “Ridge Regression in Practice.”
*The American Statistician*. Vol. 29, No. 1, 1975, pp. 3–20.

[94] Marsaglia, G., and W. W. Tsang. “A Simple Method for Generating Gamma Variables.”
*ACM Transactions on Mathematical Software*. Vol. 26, 2000, pp. 363–372.

[95] Marsaglia, G., W. Tsang, and J. Wang. “Evaluating Kolmogorov’s Distribution.”
*Journal of Statistical Software*. Vol. 8, Issue 18, 2003.

[96] Martinez, W. L., and A. R. Martinez. *Computational Statistics with MATLAB®*. New York: Chapman & Hall/CRC Press, 2002.

[97] Massey, F. J. “The Kolmogorov-Smirnov Test for Goodness of Fit.”
*Journal of the American Statistical Association*. Vol. 46, No. 253, 1951, pp. 68–78.

[98] Matousek, J. “On the L2-Discrepancy for Anchored Boxes.”
*Journal of Complexity*. Vol. 14, No. 4, 1998, pp. 527–556.

[99] McLachlan, G., and D. Peel. *Finite Mixture Models*. Hoboken, NJ: John Wiley & Sons, Inc., 2000.

[100] McCullagh, P., and J. A. Nelder. *Generalized Linear Models*. New York: Chapman & Hall, 1990.

[101] McGill, R., J. W. Tukey, and W. A. Larsen. “Variations of Boxplots.”
*The American Statistician*. Vol. 32, No. 1, 1978, pp. 12–16.

[102] Meeker, W. Q., and L. A. Escobar. *Statistical Methods for Reliability Data*. Hoboken, NJ: John Wiley & Sons, Inc., 1998.

[103] Meng, Xiao-Li, and Donald B. Rubin. “Maximum Likelihood Estimation via the ECM Algorithm.”
*Biometrika*. Vol. 80, No. 2, 1993, pp. 267–278.

[104] Myers, R. H., and D. C. Montgomery. *Response Surface Methodology: Process and Product Optimization Using Designed Experiments*. Hoboken, NJ: John Wiley & Sons, Inc., 1995.

[105] Miller, L. H. “Table of Percentage Points of Kolmogorov Statistics.”
*Journal of the American Statistical Association*. Vol. 51, No. 273, 1956, pp. 111–121.

[106] Milliken, G. A., and D. E. Johnson. *Analysis of Messy Data, Volume 1: Designed Experiments*. Boca Raton, FL: Chapman & Hall/CRC Press, 1992.

[107] Montgomery, D. *Introduction to Statistical Quality Control*. Hoboken, NJ: John Wiley & Sons, 1991, pp. 369–374.

[108] Montgomery, D. C. *Design and Analysis of Experiments*. Hoboken, NJ: John Wiley & Sons, Inc., 2001.

[109] Mood, A. M., F. A. Graybill, and D. C. Boes. *Introduction to the Theory of Statistics*. 3rd ed., New York: McGraw-Hill, 1974, pp. 540–541.

[110] Moore, J. *Total Biochemical Oxygen Demand of Dairy Manures*. Ph.D. thesis. University of Minnesota, Department of Agricultural Engineering, 1975.

[111] Mosteller, F., and J. Tukey. *Data Analysis and Regression*. Upper Saddle River, NJ: Addison-Wesley, 1977.

[112] Nelson, L. S. “Evaluating Overlapping Confidence Intervals.”
*Journal of Quality Technology*. Vol. 21, 1989, pp. 140–141.

[113] Patel, J. K., C. H. Kapadia, and D. B. Owen. *Handbook of Statistical Distributions*. New York: Marcel Dekker, 1976.

[114] Pinheiro, J. C., and D. M. Bates. “Approximations to the log-likelihood function in the nonlinear mixed-effects model.”
*Journal of Computational and Graphical Statistics*. Vol. 4, 1995, pp. 12–35.

[115] Rahimi, A., and B. Recht. “Random Features for Large-Scale Kernel Machines.”
*Advances in Neural Information Processing Systems*. Vol. 20, 2008, pp. 1177–1184.

[116] Rice, J. A. *Mathematical Statistics and Data Analysis*. Pacific Grove, CA: Duxbury Press, 1994.

[117] Rosipal, R., and N. Kramer. “Overview and Recent Advances in Partial Least Squares.”
*Subspace, Latent Structure and Feature Selection: Statistical and Optimization Perspectives Workshop (SLSFS 2005), Revised Selected Papers (Lecture Notes in Computer Science 3940)*. Berlin, Germany: Springer-Verlag, 2006, pp. 34–51.

[118] Sachs, L. *Applied Statistics: A Handbook of Techniques*. New York: Springer-Verlag, 1984, p. 253.

[119] Scott, D. W. *Multivariate Density Estimation: Theory, Practice, and Visualization*. John Wiley & Sons, 2015.

[120] Searle, S. R., F. M. Speed, and G. A. Milliken. “Population marginal means in the linear model: an alternative to least-squares means.”
*The American Statistician*. 1980, pp. 216–221.

[121] Seber, G. A. F., and A. J. Lee. *Linear Regression Analysis*. 2nd ed. Hoboken, NJ: Wiley-Interscience, 2003.

[122] Seber, G. A. F. *Multivariate Observations*. Hoboken, NJ: John Wiley & Sons, Inc., 1984.

[123] Seber, G. A. F., and C. J. Wild. *Nonlinear Regression*. Hoboken, NJ: Wiley-Interscience, 2003.

[124] Sexton, Joe, and A. R. Swensen. “ECM Algorithms that Converge at the Rate of EM.”
*Biometrika*. Vol. 87, No. 3, 2000, pp. 651–662.

[125] Silverman, B. W. *Density Estimation for Statistics and Data Analysis*. Chapman & Hall/CRC, 1986.

[126] Snedecor, G. W., and W. G. Cochran. *Statistical Methods*. Ames, IA: Iowa State Press, 1989.

[127] Spath, H. *Cluster Dissection and Analysis: Theory, FORTRAN Programs, Examples*. Translated by J. Goldschmidt. New York: Halsted Press, 1985.

[128] Stein, M. “Large sample properties of simulations using latin hypercube sampling.”
*Technometrics*. Vol. 29, No. 2, 1987, pp. 143–151. Correction, Vol. 32, p. 367.

[129] Stephens, M. A. “Use of the Kolmogorov-Smirnov, Cramer-Von Mises and Related Statistics Without Extensive Tables.”
*Journal of the Royal Statistical Society*. Series B, Vol. 32, No. 1, 1970, pp. 115–122.

[130] Street, J. O., R. J. Carroll, and D. Ruppert. “A Note on Computing Robust Regression Estimates via Iteratively Reweighted Least Squares.”
*The American Statistician*. Vol. 42, 1988, pp. 152–154.

[131] Student. “On the Probable Error of the Mean.”
*Biometrika*. Vol. 6, No. 1, 1908, pp. 1–25.

[132] Velleman, P. F., and D. C. Hoaglin. *Applications, Basics, and Computing of Exploratory Data Analysis*. Pacific Grove, CA: Duxbury Press, 1981.

[133] Weibull, W. “A Statistical Theory of the Strength of Materials.”
*Ingeniors Vetenskaps Akademiens Handlingar*. Stockholm: Royal Swedish Institute for Engineering Research, No. 151, 1939.

[134] Zahn, C. T. “Graph-theoretical methods for detecting and describing Gestalt clusters.”
*IEEE Transactions on Computers*. Vol. C-20, Issue 1, 1971, pp. 68–86.