Prerequisites
Graduate standing or
One semester of computer coding. Typical courses include, but are not limited to, CS 1400, CS 1420, ME EN 2450, or MSE 2100
And
One semester of probability or statistics. Typical courses include, but are not limited to, ME EN 2550, CS 3130, ECE 3530, or MATH 3070
Student Learning Objectives
Upon successful completion of this course, students shall be able to:
- Use computer programs to read and store data.
- Present data in a graphically appealing format.
- Perform and interpret statistical tests and confidence intervals.
- Develop and assess the accuracy of models of data using regression.
- Classify data into groups using clustering.
- Implement searches for local or global optimal solutions.
- Create heuristics to enable a computer to determine improved solutions to systems.
Course Description
This course provides a broad overview of data analytics within engineering disciplines. Students will program in Python and learn to read, visualize, analyze, and interpret data to identify trends in order to improve and optimize systems. The topics covered include arrays, loops, data structures, graphing data, hypothesis testing, regression, clustering, optimization search techniques, and simple heuristics such as hill climbing and simulated annealing.
Sample Lectures
The class assumes that the student has never coded in Python. The casino game of craps is coded in lecture, followed by the gambler's ruin problem. A history of wins and losses is stored to determine statistics such as the longest win streak. Through these lectures students learn about subroutines, if-then statements, for loops, while loops, and arrays. The next set of lectures deals with double (two-dimensional) arrays. A word search problem is coded to help students understand double arrays. We search for the word Utes forward, backward, and along at least one diagonal. The following lecture is the first lecture in this series of word search coding lectures.
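The word search described above can be sketched as follows. This is a minimal illustration, not the code written in lecture: the 5x5 grid, the function name find_word, and the choice of the down-right diagonal are all assumptions made for the example.

```python
# Hypothetical 5x5 puzzle stored as a double (two-dimensional) array;
# the puzzle used in lecture is not given in the text.
GRID = [
    "UTESA",
    "BTCDF",
    "GHEJK",
    "LMNSP",
    "QSETU",
]

# (row step, column step) for each direction searched.
DIRECTIONS = {
    "forward": (0, 1),    # left to right
    "backward": (0, -1),  # right to left
    "diagonal": (1, 1),   # down and to the right
}


def find_word(grid, word):
    """Return (row, col, direction) for every place `word` starts in `grid`."""
    n_rows, n_cols = len(grid), len(grid[0])
    hits = []
    for r in range(n_rows):
        for c in range(n_cols):
            for name, (dr, dc) in DIRECTIONS.items():
                # Skip starting cells where the word would run off the grid.
                end_r = r + dr * (len(word) - 1)
                end_c = c + dc * (len(word) - 1)
                if not (0 <= end_r < n_rows and 0 <= end_c < n_cols):
                    continue
                letters = [grid[r + dr * k][c + dc * k] for k in range(len(word))]
                if "".join(letters) == word:
                    hits.append((r, c, name))
    return hits


print(find_word(GRID, "UTES"))
```

In this example grid the word is found forward and diagonally starting at row 0, column 0, and backward starting at row 4, column 4.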
Once students have mastered basic coding, the class moves on to hypothesis testing and then to regression. The following lecture describes the premise of linear regression. The next lecture examines Dr. Easton's code to improve regression models by eliminating nonsignificant independent variables. Rather than detail the code, which is done in a prior course lecture, the lecture describes why the code eliminates a nonsignificant variable. Each of these lectures is only one of a series of lectures on its topic.
Multiple Linear Regression Python
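The premise of linear regression mentioned above, choosing the line that minimizes the sum of squared residuals, can be illustrated with a single predictor. This is a minimal sketch in plain Python and is not Dr. Easton's code; the function names and the sample data are hypothetical, and the multiple regression and variable elimination covered in the lectures are not reproduced here.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Fraction of the variation in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Made-up data for illustration only.
X = [1, 2, 3, 4, 5]
Y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(X, Y)
print("slope:", slope, "intercept:", intercept)
print("R^2:", r_squared(X, Y, slope, intercept))
```

An R^2 value near 1 indicates that the line accounts for most of the variation in the data; judging whether an individual variable is significant requires the hypothesis tests covered earlier in the course.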
The class concludes with topics in heuristics. The clustering problem and the k-means heuristic are described both theoretically and in code. In a project, students cluster all the bicycle accidents with a police report in Austin, Texas, over several years. The class ends with general heuristics, and in a project students code their own simulated annealing algorithm for the traveling salesperson problem.
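As a rough illustration of the simulated annealing project mentioned above, the sketch below applies the heuristic to a small traveling salesperson instance. The city coordinates, the parameter values (start_temp, cooling, steps), and the segment-reversal move are assumptions chosen for the example; the course project may be set up differently.

```python
import math
import random

# Hypothetical city coordinates for illustration only.
CITIES = [(0, 0), (3, 1), (1, 4), (5, 5), (2, 2), (6, 0), (4, 3), (0, 5)]


def tour_length(tour, cities):
    """Total length of the closed tour that visits cities in the given order."""
    return sum(
        math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def simulated_annealing(cities, start_temp=10.0, cooling=0.995, steps=5000, seed=0):
    """Return (best_tour, best_length) found by simulated annealing."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    current = list(range(len(cities)))
    current_len = tour_length(current, cities)
    best, best_len = current[:], current_len
    temp = start_temp
    for _ in range(steps):
        # Neighbor move: reverse the segment between two random positions.
        i, j = sorted(rng.sample(range(len(cities)), 2))
        candidate = current[:i] + current[i:j + 1][::-1] + current[j + 1:]
        cand_len = tour_length(candidate, cities)
        delta = cand_len - current_len
        # Always accept improvements; accept a worse tour with
        # probability exp(-delta / temp), which shrinks as temp cools.
        if delta < 0 or rng.random() < math.exp(-delta / temp):
            current, current_len = candidate, cand_len
            if current_len < best_len:
                best, best_len = current[:], current_len
        temp *= cooling
    return best, best_len


best_tour, best_len = simulated_annealing(CITIES)
print("best tour:", best_tour)
print("length:", best_len)
```

Accepting some worse tours early on is what distinguishes simulated annealing from hill climbing, which only ever accepts improvements and can therefore stall at a local optimum.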