This course covers the theory and practice of experimental data analysis, and will touch on four basic components: 1) the basic theory of probability and its use in physics, 2) analysis techniques all experimentalists will use every day, 3) computational methods made possible by readily available computers, and 4) several advanced topics that experimentalists should be aware of. The level of this course is appropriate to the analysis requirements of PHY445 and PHY515 (the senior/graduate laboratory course).
This is not a course in numerical methods, but it will concentrate heavily on the use of computers in data analysis. While prior programming experience will be helpful, the curriculum assumes only that students are familiar with modern PCs, not that they have programmed before. Programming will be taught as it is needed. To be successful in PHY310, students must have access to a PC and be able to complete simple programming assignments.
Documentation for software I use in my examples
Class Calendar covering lectures and homework assignments
Example code for many topics covered during lecture.
Final exam will be due May 16 at 5pm.
PHY 310 is a three-credit course. There will be regular homework, an in-class midterm, and a take-home final/project.
Class Meetings: Monday and Wednesday, 3:50-5:10pm, D-122
Grading: Homework, the midterm, and the final project will be counted equally.
Measurement, analysis, and interpretation in physics
The basic mathematics of probability
Definitions: probability, distribution functions, density functions
Common distributions: binomial, Poisson, Gaussian, chi-squared
Expectation values &c: mean, mode, variance, covariance
Confidence and statistics
Data simulation techniques
Introduction to parameter estimation: least squares minimization
Linear and non-linear minimization
Hypothesis testing: Student's “t” test, chi-squared test, trials factor
Error analysis: error propagation, statistical vs systematic uncertainty
Confidence intervals: definitions, estimation, the role of assumptions
Advanced parameter estimation: maximum likelihood, robust estimators
Discriminants: maximum likelihood, Fisher's discriminant, neural nets
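To show how a few of the topics above fit together, here is a minimal sketch (not taken from the course examples) that simulates straight-line data with Gaussian noise, fits it by least squares, and computes the chi-squared of the fit. It uses only the Python standard library. The function names, the true slope and intercept, and the noise width are all made up for illustration, and the fit assumes the same known uncertainty on every point.

```python
import math
import random


def simulate_line(n_points, slope, intercept, sigma, seed=1):
    """Generate points on a straight line with Gaussian noise of width sigma."""
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    xs = [float(i) for i in range(n_points)]
    ys = [intercept + slope * x + rng.gauss(0.0, sigma) for x in xs]
    return xs, ys


def fit_line(xs, ys, sigma):
    """Least-squares fit of y = a + b*x, assuming equal uncertainty sigma on each y."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    delta = n * sxx - sx * sx
    a = (sxx * sy - sx * sxy) / delta      # intercept
    b = (n * sxy - sx * sy) / delta        # slope
    sigma_a = sigma * math.sqrt(sxx / delta)
    sigma_b = sigma * math.sqrt(n / delta)
    # Chi-squared: sum of squared residuals in units of the uncertainty
    chi2 = sum(((y - a - b * x) / sigma) ** 2 for x, y in zip(xs, ys))
    return a, b, sigma_a, sigma_b, chi2


xs, ys = simulate_line(n_points=20, slope=2.0, intercept=1.0, sigma=0.5)
a, b, sigma_a, sigma_b, chi2 = fit_line(xs, ys, sigma=0.5)
ndf = len(xs) - 2  # degrees of freedom: points minus fitted parameters
print(f"intercept = {a:.3f} +/- {sigma_a:.3f}")
print(f"slope     = {b:.3f} +/- {sigma_b:.3f}")
print(f"chi2/ndf  = {chi2:.1f}/{ndf}")
```

For a reasonable fit, the chi-squared should be comparable to the number of degrees of freedom, and the fitted parameters should land within a few standard deviations of the values used in the simulation.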
Statistical Data Analysis, Glen Cowan, Oxford Science Publications (1998)
This is an excellent text covering statistical data analysis from the point of view of a practicing particle physicist.
Data Reduction and Error Analysis for the Physical Sciences, 3rd edition, Bevington and Robinson, McGraw Hill
This book is required for PHY445/PHY515. It's the classic statistics introductory text for physicists. If you purchase this book, be sure to get the third edition which corrects many errors.
Probability and Statistics in Experimental Physics, 2nd edition, Byron Roe, Springer (2001)
Be sure to get the second edition. This presents a very technical view of data analysis, but is very opinionated. It covers some interesting topics not usually touched upon.
In addition to the material from the text and lectures, a significant portion of this course will be computer based. Students are expected to have access to a computer, and may use any program or programming language they wish, as long as print-outs, CD-ROM, DVD-ROM, or email attachments can be handed in. Examples will be provided using several freely available programs and libraries.
PHY310 is not a programming course, but many assignments will require you to use a computer. Students in this course will need access to a symbolic algebra program, a data presentation package, and a programming language. This is a short list of some of the software that you may find useful during this class. While you won't be required to use these particular programs, you will need access to something similar. Examples and solutions will be provided using several programs that can be downloaded and installed on a student's machine. Since not all students will have programming experience, I will teach just enough to complete the homework assignments and exams. In-class examples will mostly use PYTHON (MAXIMA for symbolic algebra).
MAPLE: a symbolic analysis program available to students through the physics department office.
MAXIMA: a freely available symbolic analysis program available to students from maxima.sourceforge.net. This is the direct descendant of MACSYMA, which was developed at MIT in the late 1960s, and it still has an active user community. I will use the wxmaxima interface for interactive symbolic algebra examples.
GNUPLOT: A freely available data plotting program that can be used on Linux, Mac OS or MS Windows. It can be downloaded from www.gnuplot.info. I strongly suggest version 4 or later. GNUPLOT can be used in conjunction with the Python SciPy package.
PYTHON: This is a freely available, interpreted, object-oriented programming language that is becoming popular for data analysis. It is part of the default installation for most (all?) Linux distributions and Mac OS X 10. An installer for Windows can be found at www.python.org, but if you have enough disk space I recommend the installer provided by Enthought, Inc., which installs many extra useful modules by default. PYTHON will be used for examples in PHY310 (sufficient PYTHON to understand the examples will be taught as part of the lecture). There are several useful PYTHON packages that can also be installed.
matplotlib – A very good set of plotting routines that works well with the Enthought Python distributions. There is an installer for Windows, there are rpms for Red Hat, and it's part of Debian.
SciPy – A set of science-oriented utilities for Python. This is included in the Enthought Python distribution on Windows.
ROOT: A very powerful, full-featured (and almost friendly) data analysis library written in C++ that works very well with PYTHON. This library is maintained by a team of programmers at CERN and is favored by the HEP and nuclear physics communities. It has also gained some popularity with the Quant investment community. It can be downloaded from root.cern.ch for Linux, OS X, or MS Windows. Install the latest “pro” version (5.00.08 when I wrote this). It's a little hard to install, but well worth the effort. I will use ROOT for some analysis examples.
Any C or C++ Compiler: C++ programming will not be required for this course, but if you already know C++, it works very well with ROOT. Several compilers are available. If you are using Linux, the obvious choice is the GNU Compiler Collection (which is almost certainly already installed on your machine). If you are using MS Windows, you can download and install the free version of MS Visual C++.
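As a rough illustration of the scale of program that a plain PYTHON installation can handle, the following minimal sketch (not taken from the course examples) compares analytic error propagation with a simple simulation for a pendulum measurement of g. The measured length and period and their uncertainties are invented for illustration, the function names are hypothetical, and the two uncertainties are assumed to be Gaussian and uncorrelated.

```python
import math
import random
import statistics


def g_from_pendulum(length, period):
    """Gravitational acceleration from a simple pendulum: g = 4 pi^2 L / T^2."""
    return 4.0 * math.pi ** 2 * length / period ** 2


def propagate_analytic(length, sigma_length, period, sigma_period):
    """First-order error propagation, assuming uncorrelated uncertainties."""
    g = g_from_pendulum(length, period)
    # Relative errors add in quadrature; the period enters squared, hence the 2
    rel = math.sqrt((sigma_length / length) ** 2
                    + (2.0 * sigma_period / period) ** 2)
    return g, g * rel


def propagate_monte_carlo(length, sigma_length, period, sigma_period,
                          n_trials=50000, seed=2):
    """Estimate the spread of g by drawing many simulated measurements."""
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    values = [
        g_from_pendulum(rng.gauss(length, sigma_length),
                        rng.gauss(period, sigma_period))
        for _ in range(n_trials)
    ]
    return statistics.mean(values), statistics.stdev(values)


# Made-up measurement: L = 1.000 +/- 0.002 m, T = 2.007 +/- 0.005 s
g_a, sigma_a = propagate_analytic(1.000, 0.002, 2.007, 0.005)
g_mc, sigma_mc = propagate_monte_carlo(1.000, 0.002, 2.007, 0.005)
print(f"analytic:    g = {g_a:.3f} +/- {sigma_a:.3f} m/s^2")
print(f"monte carlo: g = {g_mc:.3f} +/- {sigma_mc:.3f} m/s^2")
```

The two estimates should agree closely here because the uncertainties are small compared with the measured values; for large uncertainties the first-order formula becomes less reliable and the simulation is the safer check.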