Lecture 21 - Activity Recognition
In this lecture, we will discuss the training phase of activity recognition. The complete machine learning task of building a classification system that can automatically infer your activity (e.g., standing, walking, running) breaks down into two sequential phases:
- First, the training phase, where we collect data using the MyRuns data collector (which you are given), label the data (e.g., walking), and then feed the resulting sensor data into WEKA (Waikato Environment for Knowledge Analysis -- a very cool machine learning tool), which in turn produces a classifier (the WekaClassifier Java class in our MyRuns5) that we embed in MyRuns5 to form a component of what is called the activity recognition "classification pipeline". Technically, this technique is called "supervised learning" in machine learning (ML) parlance.
- Second, the classification or inference phase, where we embed the WekaClassifier produced by WEKA into our MyRuns5 project in a file called WekaClassifier.java -- this forms one component of the classification pipeline that runs as part of your application. The parts of the pipeline (including the WekaClassifier) are as follows:
- reading the x, y, z sensor data from the SensorManager;
- computing the magnitude from a single x, y, z reading;
- computing the FFT coefficient of the magnitude - this is called a feature;
- segmenting features into batches of 64 features in a buffer -- this is called the feature vector;
- adding the maximum value of the magnitude across all 64 individual magnitudes to the feature vector; and
- calling the WekaClassifier inputting the feature vector -- this is called the classification phase and produces an inference (e.g. walking).
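As a sketch of the magnitude step in the pipeline above, a hypothetical helper might look like this (the class and method names are my own, not MyRuns code):

```java
public class Magnitude {
    // Magnitude of one accelerometer sample: sqrt(x^2 + y^2 + z^2).
    // This collapses the three axes into a single orientation-independent value.
    public static double compute(double x, double y, double z) {
        return Math.sqrt(x * x + y * y + z * z);
    }

    public static void main(String[] args) {
        // A phone lying still reads roughly (0, 0, 9.81) -- gravity only.
        System.out.println(compute(0.0, 0.0, 9.81)); // ~9.81
    }
}
```

Each call to this helper corresponds to one (x, y, z) reading delivered by the SensorManager; 64 such magnitudes are then batched before the FFT step.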
What this lecture will teach you
- Training phase workflow
- Data collector
- WEKA overview
An overview of training workflow steps
The training phase is illustrated in the figure below -- don't panic: we do not expect you to write low-level signal processing functions such as computing the FFT -- we give you code to do that. The training phase boils down to the following steps at a high level:
- Install WEKA on your computer
- Import the myrunsdatacollector.zip into ADT/Eclipse and load it onto your phone
- Use the collector to collect your data -- standing, walking, running (you can do other classes too -- by the way, the term class also means activity: the running class, the walking class. We simply say classes to mean, in a general sense, all activities; activities == classes). Your data will be saved in a file on your phone called features.arff -- it's a cryptic name, hey. This file is used as the input to WEKA.
- Use the file explorer in Eclipse to upload the features.arff file and store it in any folder on your computer (you can use any text editor to view the data -- as you will see, it is all the samples and labels; a label is, for example, when you tell the collector "this is walking").
- Double-click on the features.arff file. This will start WEKA, assuming you have installed it correctly.
- Create a decision tree J48 classifier from the features.arff data
- Cut and paste the class WekaClassifier {....code...} into the WekaClassifier.java file in your MyRuns5 project. Make sure you only cut and paste the WekaClassifier class and not all the other information provided by WEKA. You will see what I mean when you get to that part.
Training workflow diagram explained
Let's discuss this diagram in a little more detail. Note that there is a lot of code reuse between the collector (used for the training phase) and the classification phase that supports MyRuns' automatic mode. But for the training phase you do not have to write code -- as indicated in the steps above. In the training phase, a method called onSensorChanged() produces sensor samples (x, y, z) in a time series (each time onSensorChanged() is called), from which we compute m (the magnitude). The workflow buffers up 64 consecutive magnitudes (m0..m63) before computing the FFT of the magnitudes (f0..f63) -- each of these features is called an FFT coefficient; e.g., f0 and f63 are the lowest and highest frequency coefficients, as shown in the diagram.
As shown in the upper left of the diagram, the FFT transforms a time series of amplitude over time into magnitude (some representation of amplitude) across frequency; the example shows an oscillating system where the dominant frequency is between 4-8 cycles/second, called Hertz (Hz) -- imagine a ball attached to an elastic band that is stretched and oscillates for a short period of time, or your gait while walking or running -- one could look at these systems in either the time or the frequency domain. The x, y, z accelerometer readings and the magnitude are time domain variables. We transform these time domain data into the frequency domain because it represents the distribution in a nice compact manner that the classifier will use to build a decision tree model. For example, the amplitude transposed to the frequency domain may look something like the bottom plot in the figure -- the top plot is the time domain, and the bottom plot is a transformation of the time domain to the frequency domain. Cool, hey.
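To see why energy "concentrates" at the dominant frequency, here is a toy discrete Fourier transform (a plain DFT for clarity, not the FFT.java shipped with the collector): a pure 4-cycle sine wave over 64 samples produces a large coefficient at bin 4 and near-zero coefficients elsewhere.

```java
public class DftDemo {
    // Magnitude of the k-th DFT coefficient of a real-valued signal.
    static double coefficientMagnitude(double[] signal, int k) {
        double re = 0, im = 0;
        int n = signal.length;
        for (int t = 0; t < n; t++) {
            re += signal[t] * Math.cos(2 * Math.PI * k * t / n);
            im -= signal[t] * Math.sin(2 * Math.PI * k * t / n);
        }
        return Math.sqrt(re * re + im * im);
    }

    public static void main(String[] args) {
        // A pure sine with 4 cycles across 64 samples: energy piles up at k = 4.
        double[] s = new double[64];
        for (int t = 0; t < 64; t++) s[t] = Math.sin(2 * Math.PI * 4 * t / 64);
        System.out.println(coefficientMagnitude(s, 4) > coefficientMagnitude(s, 5)); // prints "true"
    }
}
```

The same idea applies to your gait: walking and running have different dominant frequencies, which is what makes the FFT coefficients useful features.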
The training phase also stores the maximum (MAX) magnitude of (m0..m63) and the user-supplied label (e.g., walking) entered via the collector. The computed FFT coefficients (f0..f63), the MAX magnitude, and the label are combined: collectively, we call these features the feature vector, and again it comprises:
magnitudes (f0..f63), MAX magnitude, label.
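Each feature vector becomes one row of the WEKA-formatted file. As a rough illustration (the attribute names here are invented -- check your own features.arff for the exact specification), the layout looks like this:

```text
@relation features

% Illustrative only: the real file declares one numeric attribute per FFT
% coefficient (64 of them) before the max and label attributes.
@attribute fft_coef_0000 numeric
@attribute fft_coef_0001 numeric
@attribute fft_coef_0002 numeric
@attribute max numeric
@attribute label {Standing,Walking,Running}

@data
0.42,0.17,0.09,11.93,Walking
```

The part before @data is the specification; every line after @data is one labeled feature vector.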
Because the user can collect a smaller or larger amount of data, the number of feature vectors will vary (and the size and complexity of the generated WekaClassifier class will change). The longer you collect data, the more feature vectors accumulate. Once the user has stopped collecting training data using the collector tool, we move to WEKA to carry on the workflow. The output of the collector is features.arff, and this file is the input to the WEKA tool. Get it? You need to make sure you install WEKA to keep moving on to the next step in the workflow.

Classification pipeline
While we focus on the training phase in this lecture, the diagram also shows the classification phase. MyRuns uses exactly the same workflow as the training phase in terms of computing the feature vectors (sampling x, y, z, computing the magnitudes -- and MAX -- and the FFT coefficients), but this time there is no user-supplied label as in the collector. The classification phase does not need the user to supply a label because it has a trained model (WekaClassifier) that can infer the label (e.g., walking) from the feature vectors. The pipeline of computing the feature vectors is simply shown in the diagram as new samples to be classified, which are fed into the classifier (WekaClassifier) -- the result is a label (e.g., running). So in the classification phase our running MyRuns5 app gets the accelerometer data (x, y, z), computes the feature vector, and passes it to the WekaClassifier, and as a result your phone can now automatically infer what you are doing!

MyRuns Data Collector
We now discuss the project design for the collector. As shown below, the collector consists of the CollectorActivity, Globals (data), and SensorService. The project also includes FFT.java and weka.jar. Again, we give you the source code for the collector, and much of the processing used during real-time classification can also be seen here in the collector, which is used as part of the off-line training phase.

The delete button removes the data, which is stored in a file on the phone (called features.arff). If you want to start all over, hit delete before starting; otherwise the newly collected data will be appended to the end of the file. You could collect multiple instances of training data -- go for multiple runs and walks at different times -- and it will all accumulate in the same file. The collector implements the training phase shown in the diagram and discussed in the section above. The result of the collector phase is the creation of a WEKA-formatted features.arff file. So the collector needs to be able to collect accelerometer samples, compute the magnitudes, and use the FFT.java class to compute the coefficients, etc., to produce the feature vectors. The weka.jar APIs are used to format the feature vectors into the correct format for the WEKA tool that produces the classifier, as discussed in the next section.
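To make the collector's per-sample processing concrete, here is a minimal sketch of buffering 64 magnitudes into a feature vector. This is not the actual collector code: the class name is invented, and where the real collector calls FFT.java we simply pass the magnitudes through so the sketch stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (not the MyRuns collector): accumulate 64 magnitudes,
// then emit a 65-entry feature vector (64 coefficients plus the MAX magnitude).
public class FeatureBuffer {
    public static final int WINDOW = 64;
    private final List<Double> buffer = new ArrayList<>();

    // Called once per accelerometer sample, as onSensorChanged() would be.
    // Returns a feature vector when the window is full, otherwise null.
    public double[] addSample(double x, double y, double z) {
        buffer.add(Math.sqrt(x * x + y * y + z * z));
        if (buffer.size() < WINDOW) return null;

        double[] features = new double[WINDOW + 1];
        double max = Double.NEGATIVE_INFINITY;
        for (int k = 0; k < WINDOW; k++) {
            // The real collector computes FFT coefficients here (via FFT.java);
            // we copy the magnitudes through unchanged to keep the sketch simple.
            features[k] = buffer.get(k);
            max = Math.max(max, buffer.get(k));
        }
        features[WINDOW] = max;   // MAX magnitude appended as the 65th feature
        buffer.clear();
        return features;
    }

    public static void main(String[] args) {
        FeatureBuffer fb = new FeatureBuffer();
        double[] v = null;
        for (int n = 1; n <= 64; n++) v = fb.addSample(0.0, 0.0, n);
        System.out.println(v.length + " " + v[64]); // prints "65 64.0"
    }
}
```

In the collector this vector, plus the user-supplied label, is what gets appended to features.arff.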
Next, we need to upload the saved data to the laptop. The File Explorer in Eclipse helps us do that.
WEKA overview
We now discuss how to use the data collected by the data collector as input to the WEKA tool to generate the WekaClassifier Java class based on the J48 decision tree (an ML algorithm).

Installing WEKA
First you need to install WEKA. Once the features.arff file is uploaded, my Mac knows it is a WEKA file, as shown by the WEKA bird icon on the file. Clearly, you need to have downloaded and installed WEKA for the system to recognize the file extension. Therefore, you need to install WEKA.

features.arff file format
Once WEKA is installed on your laptop and the features.arff file is uploaded to your laptop, WEKA can generate the classifier. BTW, if you open features.arff using your favorite text editor, you will see the WEKA format of the file -- interesting, hey. The layout of the file makes complete sense. The first part of the file is the specification, followed by the feature vectors, as shown in the figure below. The more time you spend collecting data, the more feature vectors you will see in the file.

Generating the Classifier
If you click on features.arff, WEKA will fire up automatically with the window shown below. If you look at the code produced by WEKA, you will see a bunch of if-then statements (see the code below). It is convoluted, and you will not be able to make sense of it. To understand the code, take the CS machine learning class. Until then, consider WEKA magic. This classifier is a personal classifier based only on your data. If you did this in the wild, you would collect a large corpus of training data from a large number of people. Why do that? Because you would like your app to work not just for you but for a large population of users.
Plugging the classifier into your project
The last thing you have to do is cut and paste the class WekaClassifier { ... } (and only the Java code -- not the other stuff like the confusion matrix) into WekaClassifier.java in your MyRuns5 project. For example, here is the code I trained and popped into WekaClassifier.java:

class WekaClassifier {
    public static double classify(Object[] i)
            throws Exception {
        double p = Double.NaN;
        p = WekaClassifier.N36673c9f0(i);
        return p;
    }

    static double N36673c9f0(Object[] i) {
        double p = Double.NaN;
        if (i[64] == null) {
            p = 1;
        } else if (((Double) i[64]).doubleValue() <= 10.353474) {
            p = WekaClassifier.N33bd06a01(i);
        } else if (((Double) i[64]).doubleValue() > 10.353474) {
            p = 2;
        }
        return p;
    }

    static double N33bd06a01(Object[] i) {
        double p = Double.NaN;
        if (i[0] == null) {
            p = 0;
        } else if (((Double) i[0]).doubleValue() <= 38.193106) {
            p = 0;
        } else if (((Double) i[0]).doubleValue() > 38.193106) {
            p = WekaClassifier.N3efe0ce92(i);
        }
        return p;
    }

    static double N3efe0ce92(Object[] i) {
        double p = Double.NaN;
        if (i[12] == null) {
            p = 1;
        } else if (((Double) i[12]).doubleValue() <= 1.817792) {
            p = 1;
        } else if (((Double) i[12]).doubleValue() > 1.817792) {
            p = WekaClassifier.N1901b54e3(i);
        }
        return p;
    }

    static double N1901b54e3(Object[] i) {
        double p = Double.NaN;
        if (i[64] == null) {
            p = 2;
        } else if (((Double) i[64]).doubleValue() <= 4.573082) {
            p = 2;
        } else if (((Double) i[64]).doubleValue() > 4.573082) {
            p = 1;
        }
        return p;
    }
}
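Once pasted in, the generated classifier is invoked with an Object[] holding the 64 FFT coefficients followed by the MAX magnitude. Here is a hedged sketch of the call (the feature values are made up, and classify() here is a stub that reproduces only the first split of the tree above; the real WekaClassifier recurses into further nodes):

```java
public class ClassifyDemo {
    // Stub of the generated classifier: first split only, on i[64] (MAX
    // magnitude). The <= branch is collapsed to a single class for brevity.
    static double classify(Object[] i) {
        if (i[64] == null) return 1;
        if (((Double) i[64]).doubleValue() <= 10.353474) return 0;
        return 2;
    }

    public static void main(String[] args) {
        Object[] fv = new Object[65];
        for (int k = 0; k < 64; k++) fv[k] = 0.0; // f0..f63 (made-up FFT values)
        fv[64] = 12.5;                            // MAX magnitude
        System.out.println(classify(fv));         // prints 2.0
    }
}
```

The double it returns is an index into the activity classes you trained (the mapping depends on the order of labels in your features.arff).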
OK. Now you have completed the training phase and imported your classifier into your project. Now you have to implement the rest of the classification pipeline -- it's described in the lab write-up for the MyRuns Design Doc.