Advances in Multivariate Statistical Methods (S...
Important statistical methods and relevant theory for analyzing continuous multivariate data are introduced. The first half of the course examines traditional and fundamental topics in some depth, and the second half surveys modern topics. There is no prerequisite for this course. However, I assume students have a working knowledge of probability, statistics, and matrix algebra.
Stuck drill pipe is a common problem known to cost the industry hundreds of millions of dollars annually. Kingsborough et al. and Hempkins et al. of Chevron demonstrated the use of multivariate statistical analysis to predict the occurrence of stuck drill pipe from patterns in drilling parameters. This pioneering work applied discriminant analysis techniques to a large body of drilling data to find the prevailing differences in drilling variables between wells that stuck and wells that did not. Once the pattern of variables associated with non-stuck drilling has been established, future wells can be designed to emulate it.
The present work builds upon the original discriminant analysis technique. The current approach uses physically meaningful and independent combinations of raw drilling variables as arguments in the multivariate statistical analysis. This makes it possible not only to detect impending stuck pipe, but also to identify which sticking mechanism contributes most to the risk. Further, the present analysis quantifies the probability of stuck pipe and provides a method for comparing this risk with the risks seen in similar wells.
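As a rough illustration of how a two-class discriminant analysis can yield a sticking probability, the sketch below fits a linear discriminant with a pooled covariance matrix and converts the discriminant score into a posterior probability. It is not the method of the work described above: the two features, the data values, and the function names (`fit_lda`, `stuck_probability`) are all invented for the example.

```python
import math

def mean_vec(rows):
    """Column means of a list of equal-length tuples."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def pooled_cov_2d(a, b):
    """Pooled within-class covariance matrix for two groups of 2-D points."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows in (a, b):
        m = mean_vec(rows)
        for r in rows:
            d = (r[0] - m[0], r[1] - m[1])
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(a) + len(b) - 2
    return [[s[i][j] / dof for j in range(2)] for i in range(2)]

def fit_lda(non_stuck, stuck):
    """Return (w, b) so that P(stuck | x) = sigmoid(w . x + b)."""
    m0, m1 = mean_vec(non_stuck), mean_vec(stuck)
    (s11, s12), (s21, s22) = pooled_cov_2d(non_stuck, stuck)
    det = s11 * s22 - s12 * s21
    inv = [[s22 / det, -s12 / det], [-s21 / det, s11 / det]]
    diff = (m1[0] - m0[0], m1[1] - m0[1])
    # w = Sigma^{-1} (mu_stuck - mu_non_stuck)
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    mid = ((m0[0] + m1[0]) / 2, (m0[1] + m1[1]) / 2)
    log_prior_ratio = math.log(len(stuck) / len(non_stuck))
    b = -(w[0] * mid[0] + w[1] * mid[1]) + log_prior_ratio
    return w, b

def stuck_probability(x, w, b):
    """Posterior probability of the stuck class (numerically stable sigmoid)."""
    z = w[0] * x[0] + w[1] * x[1] + b
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

# Made-up values of two hypothetical derived drilling features per well.
non_stuck = [(1.0, 2.0), (1.5, 1.8), (0.8, 2.2), (1.2, 2.5)]
stuck = [(3.0, 4.0), (3.5, 3.6), (2.8, 4.4), (3.2, 4.1)]

w, b = fit_lda(non_stuck, stuck)
print(stuck_probability((1.1, 2.1), w, b))  # close to 0 for this toy data
print(stuck_probability((3.1, 4.0), w, b))  # close to 1 for this toy data
```

With equal class sizes the prior term vanishes, so a point midway between the two class means gets a probability of 0.5. The relative sizes of the entries of `w` give a crude indication of which feature drives the score.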
Ideal for non-math majors, Advanced and Multivariate Statistical Methods teaches students to interpret, present, and write up results for each statistical technique without overemphasizing advanced math. This highly applied approach covers the why, what, when and how of advanced and multivariate statistics in a way that is neither too technical nor too mathematical. Students also learn how to compute each technique using SPSS software.
The course covers methods for modern multivariate data analysis and statistical learning, including both their theoretical foundations and practical applications. Topics include principal component analysis and other dimension reduction techniques, classification (discriminant analysis, decision trees, nearest neighbor classifiers, logistic partitioning methods, model-based methods), and categorical data analysis. There will be a significant data analysis component. (4 Credits)
Econ 671 and 672 form the basic required sequence in econometrics for all doctoral students. Their purpose is to provide Ph.D. students with the training needed to do the basic quantitative analysis generally understood to be part of the background of all modern economists. This includes: the theory and practice of testing hypotheses, statistical estimation theory, the basic statistical theory underlying the linear model, an introduction to econometric methods, and the nature of the difficulties which arise in applying statistical procedures to economic research problems. (3 Credits)
Selected topics in computational statistics including: managing and processing large data sets, parallel and distributed programming, simulation and Monte Carlo methods, interactive statistical methods, and optimization. (3 Credits)
This course will cover statistical models and methods relevant to the analysis of financial data. Topics covered will include modeling and estimation of data from heavy-tailed distributions, models and inference with multivariate copulas, linear and non-linear time series analysis, and statistical portfolio modeling. Applications from finance will be used to illustrate the methods. (3 Credits)
This is a graduate-level introductory course to key concepts, methods and theory in statistical inference. The topics covered will include univariate and multivariate families of distributions, likelihood principle, point estimation, confidence regions, hypothesis tests, large sample properties, and other selected topics in contemporary methods. (3 Credits)
The course will cover statistical methods used to analyze data in experimental molecular biology, with an emphasis on gene and protein expression array data. Topics: Data acquisition; databases; low level processing; normalization; quality control; statistical inference (group comparisons, cyclicity, survival); multiple comparisons; statistical learning algorithms; clustering; visualization; and case studies. (3 Credits)
This is an advanced introduction to the analysis of multivariate and categorical data. Topics include: (1) dimension reduction techniques, including principal component analysis, multidimensional scaling and extensions; (2) classification, starting with a conceptual framework developed from cost functions, Bayes classifiers, and issues of over-fitting and generalization, and continuing with a discussion of specific classification methods, including LDA, QDA, and KNN; (3) discrete data analysis, including estimation and testing for log-linear models and contingency tables; (4) large-scale multiple hypothesis testing, including Bonferroni, Westfall-Young and related approaches, and false discovery rates; (5) shrinkage and regularization, including ridge regression, principal component regression, partial least squares, and the lasso; (6) clustering methods, including hierarchical methods, partitioning methods, K-means, and model-based clustering. (4 Credits)
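To make the contrast in topic (4) concrete, the sketch below compares the Bonferroni correction with the Benjamini-Hochberg step-up procedure. The description above does not name Benjamini-Hochberg; it is used here as one standard way to control the false discovery rate. The p-values are made up for illustration.

```python
def bonferroni(pvals, alpha=0.05):
    """Reject H_i when p_i <= alpha / m (controls the family-wise error rate)."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]

def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (controls the false discovery rate)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0  # largest rank whose sorted p-value is <= rank / m * alpha
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * alpha:
            k = rank
    reject = [False] * m
    for i in order[:k]:  # reject the k smallest p-values
        reject[i] = True
    return reject

# Illustrative p-values from ten hypothetical tests.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

n_bonf = sum(bonferroni(pvals))
n_bh = sum(benjamini_hochberg(pvals))
print(n_bonf, n_bh)
```

On these values Bonferroni rejects only the smallest p-value, while Benjamini-Hochberg also rejects the second smallest. This reflects the usual trade-off: controlling the false discovery rate is a weaker guarantee than controlling the family-wise error rate, and it gives more power in return.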
This course covers recent developments in statistical modeling and data analysis. Topics include: (1) classification and machine learning, including support vector machines, recursive partitioning, and ensemble methods; (2) methods for analyzing sets of curves, surfaces and images, including functional data analysis, wavelets, independent component analysis, and random field models; (3) modern regression, including splines and generalized additive models; (4) methods for analyzing structured dependent data, including mixed effects models, hierarchical models, graphical models, and Bayesian networks; and (5) clustering, detection, and dimension reduction methods, including manifold learning, spectral clustering, and bump hunting. (3 Credits)
This course is an introduction to mathematical optimization with emphasis on theory and algorithms relevant to statistical practice. The course covers algorithms for large-scale matrix computations, majorization-minimization methods, Newton-type methods, and stochastic approximation. It also covers optimality conditions, Lagrange duality for convex optimization problems, and convergence analysis. (3 Credits)
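As a small example of the Newton-type methods mentioned above, the following sketch minimizes a smooth convex function of one variable by repeatedly dividing the gradient by the second derivative. The objective f(x) = exp(x) - 2x and the function name `newton_minimize` are chosen only for illustration; the exact minimizer is x = ln 2.

```python
import math

def newton_minimize(grad, hess, x0, tol=1e-10, max_iter=50):
    """One-dimensional Newton iteration: x <- x - f'(x) / f''(x)."""
    x = x0
    for _ in range(max_iter):
        step = grad(x) / hess(x)
        x -= step
        if abs(step) < tol:  # stop once the update is negligible
            break
    return x

# f(x) = exp(x) - 2x, so f'(x) = exp(x) - 2 and f''(x) = exp(x) > 0.
x_star = newton_minimize(
    grad=lambda x: math.exp(x) - 2.0,
    hess=lambda x: math.exp(x),
    x0=0.0,
)
print(x_star, math.log(2.0))
```

Because the second derivative is strictly positive everywhere, the Newton step is always well defined here. In general a line search or damping is needed to guarantee convergence from an arbitrary starting point.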
This course is an introduction to Monte Carlo sampling and integration methods that arise in statistics. Course topics include: basic Monte Carlo methods (random number generators, variance reduction techniques, importance sampling and its generalizations), an introduction to Markov chains and Markov Chain Monte Carlo (Metropolis-Hastings and Gibbs samplers, data-augmentation techniques, convergence diagnostics). Optional topics include: sequential Monte Carlo, Hamiltonian Monte Carlo, advanced computational methods (approximate Bayesian computation, variational inference) for complex statistical models such as latent variable and hierarchical or nonparametric Bayesian models. (3 Credits)
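A minimal random-walk Metropolis sampler, the simplest special case of Metropolis-Hastings, gives a feel for the MCMC portion of this list. The target (a standard normal), the step size, the seed, and the function name are assumptions made for the example.

```python
import math
import random

def metropolis_hastings(log_target, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    x, lp = x0, log_target(x0)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        lp_prop = log_target(proposal)
        # Symmetric proposal: the acceptance probability is
        # min(1, target(proposal) / target(x)).
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            x, lp = proposal, lp_prop
        samples.append(x)  # a rejected proposal repeats the current state
    return samples

def log_std_normal(x):
    """Log-density of N(0, 1) up to an additive constant."""
    return -0.5 * x * x

draws = metropolis_hastings(log_std_normal, x0=0.0, n_samples=20000, step=2.0)
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
print(mean, var)  # should be near 0 and 1 respectively
```

Only the ratio of target densities enters the acceptance step, so the normalizing constant of the target is never needed. That is what makes the method useful for posterior distributions known only up to a constant.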
This course is an advanced introduction to modern statistical learning. Topics include dimension reduction techniques, including principal component analysis, factor analysis, multidimensional scaling and manifold learning; conceptual framework of classification including cost functions, Bayes classifiers, overfitting and generalization; specific classification methods including logistic regression, naive Bayes, discriminant analysis, support vector machines, kernel-based methods, generalized additive models, tree-based methods, boosting, neural networks; clustering methods including K-means, model-based clustering algorithms, mixture models, latent variable models, hierarchical models; and algorithms such as the EM algorithm, Gibbs sampling, and variational inference methods. Additional topics that may be covered include categorical data analysis, graphical models, and deep learning. (4 Credits)
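Of the clustering methods listed, K-means is the easiest to sketch. The version below is plain Lloyd's algorithm, which alternates an assignment step and a centroid-update step until the centroids stop changing. The data points and initial centroids are invented, and real implementations usually add random restarts or smarter initialization.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, init_centers, n_iter=100):
    """Lloyd's algorithm starting from the given initial centroids."""
    centers = [tuple(c) for c in init_centers]
    labels = []
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(centers[k])  # leave an empty cluster in place
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return centers, labels

# Two well-separated toy groups in the plane.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(points, init_centers=[(0, 0), (10, 10)])
print(centers, labels)
```

Each iteration can only lower the within-cluster sum of squares, so the procedure terminates. It may stop at a local optimum that depends on the starting centroids.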
This course studies the process of statistical investigation. Students will learn to formulate scientific and statistical questions, analyze relevant data, and clearly communicate their findings. The emphasis is not on specific methods, but rather on scientific reasoning, collaboration, communication, and critical evaluation of findings. The course will be project based, using case studies from collaborative research and consulting. Key components of the course include: question formulation; data collection and study design; data cleaning and exploratory data analysis; model selection and validation; assessment of findings, post-hoc analysis, and conclusions; writing, communication, and critical assessment; and reproducibility and replicability. (4 Credits)