Air Force Office of Scientific Research (AFOSR), under the Department of Defense (DOD), under award number FA9550-12-1-0458; and National Institute of Standards and Technology (NIST), under Award No. 70NANB14H012.

Project Team Members:

Northwestern University

University of Michigan-Ann Arbor

Georgia Institute of Technology

Northwestern University - EECS Dept.

Predictive Modeling in Materials Science


Predictive modeling is arguably the strongest suit of machine learning and data mining. A predictive system is typically trained in a supervised fashion: it takes a collection of samples in the form of input-output pairs, and once its parameters are learned, it can be used to derive outputs for unseen inputs. Supervised learning can be very effective because the training loss is well defined: it is the difference between the predicted outputs and the given outputs in the training data.
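To make the idea of a well-defined loss concrete, here is a minimal sketch in plain Python. The linear model, the mean squared error, and the sample numbers are all illustrative assumptions, not taken from our systems.

```python
# Minimal supervised-learning sketch: fit y ~ w*x + b by least squares.
# The data and the choice of a linear model are illustrative assumptions.

def fit_linear(xs, ys):
    """Closed-form least-squares fit of a line to (x, y) pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b

def predict(params, x):
    """Apply the learned parameters to a single input."""
    w, b = params
    return w * x + b

def mse_loss(params, xs, ys):
    """Mean squared difference between predicted and given outputs."""
    return sum((predict(params, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Training samples as input-output pairs (made-up numbers).
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [1.1, 2.9, 5.2, 6.8, 9.0]

params = fit_linear(train_x, train_y)           # learn the parameters
train_loss = mse_loss(params, train_x, train_y)  # loss on the training data
prediction = predict(params, 5.0)                # an input not seen in training
```

Any other choice of parameters, for example `(2.0, 1.0)`, gives a training loss at least as large, which is what "learning" means for this particular loss.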

Building a prediction system has become routine practice, since many well-written software packages are available with which a data model is only a couple of lines of code away. We can certainly study training tricks, data pre-processing, and post-processing such as ensemble techniques to improve off-the-shelf systems. But what further pursuits are there?

Context Aware Predictive Modeling

In one of our attempts (see [1]), a regular data modeling system looks like this:

Figure 1. How one would normally start out building a data model.

Such a system is entirely unaware of the hidden distribution of the data it is trained on. To improve it, we propose context-aware systems. A context-aware system first detects context groups, that is, groups with high intra-group similarity and low inter-group similarity.

Figure 2. How to add context detection into the workflow.
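One way to picture the workflow in Figure 2 is the sketch below. It rests on assumptions: 1-D k-means stands in for context detection, and one least-squares line per group stands in for the per-context model. Neither is claimed to be the method used in [1], and all function names and data are hypothetical.

```python
# Context-aware sketch: (1) detect context groups, (2) train one model per group.
# 1-D k-means and per-group line fits are stand-ins, not the method from [1].

def nearest(centroids, x):
    """Index of the centroid closest to x."""
    return min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))

def kmeans_1d(xs, k=2, iters=20):
    """Plain 1-D k-means with evenly spaced initial centroids (needs k >= 2)."""
    lo, hi = min(xs), max(xs)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            groups[nearest(centroids, x)].append(x)
        # Keep the old centroid if a group ends up empty.
        centroids = [sum(g) / len(g) if g else c
                     for g, c in zip(groups, centroids)]
    return centroids

def fit_line(points):
    """Least-squares line through (x, y) points; returns (slope, intercept)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    var = sum((x - mx) ** 2 for x, _ in points)
    w = sum((x - mx) * (y - my) for x, y in points) / var
    return w, my - w * mx

def fit_context_aware(data, k=2):
    """Detect k context groups on the inputs, then fit one line per group."""
    centroids = kmeans_1d([x for x, _ in data], k)
    models = []
    for i in range(k):
        members = [(x, y) for x, y in data if nearest(centroids, x) == i]
        models.append(fit_line(members))
    return centroids, models

def predict_context_aware(centroids, models, x):
    """Route x to its context group and use that group's model."""
    w, b = models[nearest(centroids, x)]
    return w * x + b

# Two made-up regimes: y = 2x near x = 1, and y = -x + 20 near x = 11.
data = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0),
        (10.0, 10.0), (11.0, 9.0), (12.0, 8.0)]

centroids, models = fit_context_aware(data)
global_model = fit_line(data)  # single context-unaware model, for comparison
```

On this toy data the two per-group lines recover each regime, while the single global line has to compromise between them. The sketch does not guard against groups with fewer than two distinct inputs, which a real system would need to handle.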

Deep Learning Systems

At the time of writing, deep learning is advancing rapidly and attracting strong interest from many fields. Deep learning refers to the class of methods capable of learning a hierarchy of features from raw inputs through deep neural networks. The flexibility of their structure and the often huge number of parameters (tens to hundreds of millions) make these networks powerful at capturing highly nonlinear mappings between inputs and outputs. Their ability to exploit large data sets has led to great success with various data types, including images, speech, video, and text.
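The following sketch shows, in plain Python, how stacked layers transform a raw input step by step and how quickly the parameter count grows with layer width. It is not deuNet code. The layer sizes, the tanh nonlinearity (applied to the output layer too, for simplicity), and the random initialization are assumptions chosen for illustration.

```python
import math
import random

# Sketch of a fully connected deep network: each layer re-represents the
# previous layer's output. Layer sizes here are arbitrary assumptions.

def init_layers(sizes, seed=0):
    """Random weights and zero biases for consecutive layer sizes."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.gauss(0.0, 1.0 / math.sqrt(n_in)) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers

def forward(layers, x):
    """Return the activations of every layer, from raw input to output."""
    activations = [x]
    for weights, biases in layers:
        x = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
        activations.append(x)
    return activations

def count_parameters(sizes):
    """Weights plus biases for a fully connected stack."""
    return sum((n_in + 1) * n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))

small = [4, 8, 8, 1]
acts = forward(init_layers(small), [0.5, -1.0, 0.25, 2.0])
n_small = count_parameters(small)                      # 121
n_large = count_parameters([4096, 4096, 4096, 1000])   # 37,659,624
```

Even three fully connected layers of a few thousand units each already reach tens of millions of parameters, which is the scale mentioned above.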

Software Download

Our Theano-based deep learning software package, deuNet, can be downloaded here. It is a general package that makes it easy to explore different neural network architectures, training schemes, and hyperparameters.





Last Updated: 2016-11-29 23:59:16 -0600 (Tue, 29 Nov 2016)