Projects

“Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.” - John F. Woods

    Popularity of Mashable Articles

    Code on GitHub

    For my Intro to Machine Learning class, I had to devise a problem on which to run supervised learning algorithms and see how each one fared. From the UCI Machine Learning Repository I chose the Online News Popularity dataset. The problem is interesting because it has a lot of practical uses and is challenging at the same time.

    Many variables affect which articles become popular and which do not. The content and style of an article, along with the sociocultural and political atmosphere at the time, all influence its popularity. Popularity itself can be defined in various ways: how long an article remains relevant, how many people read it, how often it is shared on social media, or how many comments it generates compared to other articles. For this dataset, the data collectors defined popularity as the number of shares an article generated on social media. It is a good quantitative metric, collected over more than two years.

    The following are some of the attributes of the dataset:

    url, timedelta, n_tokens_title, n_tokens_content, n_unique_tokens, n_non_stop_words, n_non_stop_unique_tokens, num_hrefs, num_self_hrefs, num_imgs, num_videos, average_token_length, num_keywords, data_channel_is_lifestyle

    Two examples of the dataset:

    http://mashable.com/2013/01/07/astronaut-notre-dame-bcs/, 731, 9, 531, 0.503787878, 0.999999997, 0.665634673, 9, 0, 1, 0, 4.404896422

    http://mashable.com/2013/01/07/beewi-smart-toys/, 731, 10, 370, 0.559888578, 0.999999995, 0.698198195, 2, 2, 0, 0, 4.359459459

    The target value is the number of shares on social media, which ranges from 1 to about 800K. Instead of using the raw values, I standardized the shares, which maps them into a smaller range from -0.29194 to 72.23. A demarcation was then drawn: articles with standardized shares below -0.225 are labeled unpopular, those above 0.1 are labeled popular, and those in between are labeled neutral.
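
    To make the preprocessing concrete, here is a minimal sketch of the standardization and bucketing step, assuming the UCI CSV layout and the thresholds quoted above; the file path is a placeholder.

        import pandas as pd

        # Placeholder path; the UCI dataset ships as OnlineNewsPopularity.csv.
        df = pd.read_csv("OnlineNewsPopularity.csv")
        df.columns = df.columns.str.strip()  # the CSV headers are padded with spaces

        # Standardize the raw share counts (z-score), then bucket into three classes
        # using the cut-offs described above.
        z = (df["shares"] - df["shares"].mean()) / df["shares"].std()
        df["popularity"] = pd.cut(
            z,
            bins=[float("-inf"), -0.225, 0.1, float("inf")],
            labels=[0, 1, 2],  # 0 = unpopular, 1 = neutral, 2 = popular
        )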

    Initially, the data contained a huge number of neutral examples, which biased the classifiers toward the neutral class. I therefore randomly removed some of the neutral examples from the training data so that all three classes were roughly equally represented in the training set. This increased the accuracy of the classifiers by more than 15%.
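
    Continuing the sketch above, the class balancing could look roughly like this; the exact sampling used in the original experiments is not shown, so treat this as an illustration.

        # Randomly drop neutral rows so the three classes are roughly balanced.
        n_unpop = (df["popularity"] == 0).sum()
        n_pop = (df["popularity"] == 2).sum()
        target = min(n_unpop, n_pop)

        balanced = pd.concat([
            df[df["popularity"] == 0],
            df[df["popularity"] == 2],
            df[df["popularity"] == 1].sample(n=target, random_state=0),
        ]).sample(frac=1, random_state=0)  # shuffle the rows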

                     Unpopular   Neutral   Popular   Total   No. Features
    Training              3303      2756      3438    9497             59
    Testing               1166       873      1127    3166             59
    Holdout Set           1148       870      1148    3166             59

    Out of the roughly 15K examples, 60% were used for training, 20% were set aside for testing, and the remaining 20% were kept as a holdout set used for validation. The classes were dummy coded as 0 for unpopular, 1 for neutral, and 2 for popular.

    Decision Tree Classifier

    DecisionTreeClassifier from scikit-learn was used to train a three-way classifier. Initially, the tree was grown to fit all the data, which produced a tree of depth 31 and a baseline accuracy of 63.9%. The maximum depth was then varied iteratively from 33 down to 2 to find the depth that gave the best accuracy, which took about 10 seconds. A tree of depth 6 gave the best accuracy on the holdout set, at 72%.
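
    A hedged sketch of that depth sweep is below; X_train, y_train, X_holdout and y_holdout are assumed to be the splits described earlier, not variables from the original code.

        from sklearn.tree import DecisionTreeClassifier
        from sklearn.metrics import accuracy_score

        best_depth, best_acc = None, 0.0
        for depth in range(2, 34):  # sweep max_depth from 2 to 33
            clf = DecisionTreeClassifier(max_depth=depth, random_state=0)
            clf.fit(X_train, y_train)
            acc = accuracy_score(y_holdout, clf.predict(X_holdout))
            if acc > best_acc:
                best_depth, best_acc = depth, acc

        print(best_depth, best_acc)  # the write-up reports depth 6 at ~72% on the holdout set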

    [Figures: decision tree complexity and decision tree accuracy]

    A maximum depth of 6 was used to build the final classifier for testing and cross validation. From the cross-validation graph we can see that, as the amount of training data increases, accuracy on the training data decreases while accuracy on the testing set increases. This suggests that including more data would further improve the classifier's accuracy.

    Boosting

    The AdaBoost classifier from scikit-learn was used for ensemble learning to boost the DecisionTreeClassifier. There are two major hyperparameters to tune: the depth of the base decision tree and the number of trees (n_estimators). I plotted depth vs. time and n_estimators vs. time. The two appear directly proportional: increasing the depth and the number of estimators makes fitting and prediction take much longer, but also gives better predictions.
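
    The final configuration reported below (a depth-7 base tree and 10,000 estimators) would look roughly like this in scikit-learn; this is a sketch, and in scikit-learn 1.2+ the keyword is estimator rather than base_estimator.

        from sklearn.ensemble import AdaBoostClassifier
        from sklearn.tree import DecisionTreeClassifier

        boost = AdaBoostClassifier(
            base_estimator=DecisionTreeClassifier(max_depth=7),  # `estimator=` in newer versions
            n_estimators=10_000,  # large ensembles take a long time (~30 minutes in the write-up)
        )
        boost.fit(X_train, y_train)
        print(boost.score(X_test, y_test))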

    [Figures: boosting complexity, boosting time vs. depth]

    A tree of depth 7 with 10,000 estimators was therefore used to train the AdaBoost classifier. It took about 30 minutes to train and predict, and gave an accuracy of 75% on the testing set, better than all the other algorithms. The following graph shows the learning curve for the boosted classifier and the accuracy metrics:

    [Figure: boosting accuracy]

    As the graph above shows, the classifier has over-fit the training data, hence the very high accuracy on the training set. The testing accuracy has also increased slightly, which suggests that feeding in more training data would increase the accuracy further.

    Neural Network

    MultilayerPerceptron from Weka was used to run the neural network with 30 hidden nodes, using backpropagation to find the weights. The number of hidden nodes was chosen based on the number of input attributes, which is 57. The classifier was run for various numbers of epochs, which gave different accuracy rates, plotted below:

    [Figures: neural network learning curve and complexity]

    From the complexity curve above we can see that increasing the number of epochs gives better accuracy, but beyond 400 epochs accuracy decreases. Hence, 400 epochs were used to generate the cross-validation learning curve for the neural network. The model gave a cross-validation accuracy of 68% using all of the training data. The learning curve shows that as the number of training samples increases, the training accuracy increases, which suggests the model is being over-fit.

    Support Vector Machine

    The SVC module from scikit-learn was used for the three-way classification of article popularity. I compared linear and polynomial kernels by switching them out, varying the penalty (‘C’) parameter for the linear kernel and the degree for the polynomial kernel.
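
    A rough sketch of that comparison is shown below; the C values and degrees are examples of the ranges swept rather than the exact grid used, and the train/holdout splits are assumed.

        from sklearn.svm import SVC

        for C in (0.5, 1.0, 1.5, 2.0):
            linear = SVC(kernel="linear", C=C).fit(X_train, y_train)
            print("linear kernel, C =", C, "->", linear.score(X_holdout, y_holdout))

        for degree in (2, 3, 4):
            poly = SVC(kernel="poly", degree=degree).fit(X_train, y_train)
            print("poly kernel, degree =", degree, "->", poly.score(X_holdout, y_holdout))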

    [Figures: SVM linear and polynomial kernel learning curves]

    We can see that for the linear kernel a penalty ‘C’ of 1.5 gives the best holdout accuracy, 72.5%. Similarly, for the polynomial kernel the best accuracy, 73.6%, is given by a polynomial of degree 3. These settings were used to build the best classifiers and to do cross validation and testing:

    [Figures: SVM linear and polynomial kernel learning curves]

    Looking at the graphs above and the metrics, we can safely conclude that the linear kernel is better by a few percentage points and could be improved by adding more data. Both the training and testing scores increase as more data is added.

    K-Nearest Neighbors

    Running the training set through a default KNeighborsClassifier with 5 nearest neighbors gave an accuracy of 66% on the holdout set, and the dataset reduced to the 25 best features also gave 66%. To find the optimum number of neighbors, the neighbor count was increased iteratively and the holdout accuracy was calculated each time. With all 59 attributes, 43 nearest neighbors gave the highest holdout accuracy (68.5%), while with only 25 attributes, 12 nearest neighbors gave the best result (68.3%).
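
    The neighbor sweep could be written as below; this is a sketch with an illustrative range of k values, again assuming the train/holdout splits from earlier.

        from sklearn.neighbors import KNeighborsClassifier

        scores = {}
        for k in range(1, 101):  # the write-up reports the best holdout result at k = 43
            knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
            scores[k] = knn.score(X_holdout, y_holdout)

        best_k = max(scores, key=scores.get)
        print(best_k, scores[best_k])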

    [Figures: KNN complexity with 25 features and with all features]

    The following charts show accuracy against the number of neighbors for both feature sets, along with the cross-validation learning curves.

    [Figures: KNN complexity with 25 features and with all features]

    We can see that holding out attributes makes no real difference in prediction accuracy, and the training time differs by only about 10 seconds. Hence, using all the features is preferable.

    Conclusion

    Popularity prediction of online news articles can be done with 75% accuracy using the boosting ensemble method. However, it takes a long time to train and test. Increasing the number of estimators improves accuracy, but only by a very small margin: going from 5,000 to 10,000 estimators increased accuracy by 1% while the training time increased enormously.

    It can also be seen that decision trees give similar results with very little training time, correctly classifying articles 73% of the time with very little overhead. Using only the best 25 features provided little benefit, since decision trees already perform feature selection through information gain.

    The neural network did not perform on par with the other algorithms; it actually performed worse than the other classifiers. This could be because there is noise in the data that is distorting some of the weights. A few hyper-popular articles in this dataset could be pushing the weights away from optimal values.

    The linear SVM performed on par with the decision tree and was very fast. It gave an accuracy of 73% on the training set, which could mean the data is close to linearly separable. KNN gave an average score of 69%; it performed the same with all the features and with only 25 features, meaning the relative distances between nearby points did not change much with the number of features.

    References

    Caruana, Rich, and Alexandru Niculescu-Mizil. "Data mining in metric space: an empirical analysis of supervised learning performance criteria." Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2004.

    Hearn, Alison. "Structuring feeling: Web 2.0, online ranking and rating, and the digital ‘reputation’ economy." ephemera 10.3/4 (2010): 421-438.

    Tatar, Alexandru, et al. "From popularity prediction to ranking online news." Social Network Analysis and Mining 4.1 (2014): 1-12.

    Online News Popularity. (n.d.). Retrieved September 21, 2015. http://archive.ics.uci.edu/ml/datasets/Online+News+Popularity

    Check out the code on GitHub

    Randomized Optimization of Neural Network

    Code on GitHub

    Backpropagation is one of the most widely used algorithms for finding the weights of a multilayer perceptron, or neural network. It is used almost everywhere to calculate the weights of a network, but there are other ways to find them. Finding the weights can also be framed as a search problem: the weights live in an effectively infinite space, and various randomized algorithms can be applied to search that space.
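
    The experiments below use ABAGAIL (a Java library), but the idea of treating the weights as a search space can be illustrated with a small, self-contained Python sketch: flatten the weights of a tiny one-hidden-layer network into a vector, perturb it randomly, and keep any change that does not hurt training accuracy (random hill climbing). The data here is synthetic, and the shape (57 inputs, 30 hidden nodes, 3 classes) only mirrors the numbers quoted elsewhere on this page.

        import numpy as np

        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 57))    # synthetic stand-in for the 57 article features
        y = rng.integers(0, 3, size=200)  # three popularity classes

        n_in, n_hidden, n_out = 57, 30, 3
        n_weights = n_in * n_hidden + n_hidden * n_out

        def accuracy(w):
            # Unpack the flat weight vector into the two layer matrices.
            W1 = w[: n_in * n_hidden].reshape(n_in, n_hidden)
            W2 = w[n_in * n_hidden :].reshape(n_hidden, n_out)
            preds = (np.tanh(X @ W1) @ W2).argmax(axis=1)
            return (preds == y).mean()

        w = rng.normal(scale=0.1, size=n_weights)
        for _ in range(2000):
            candidate = w + rng.normal(scale=0.05, size=n_weights)  # random neighbor
            if accuracy(candidate) >= accuracy(w):                  # climb if not worse
                w = candidate
        print(accuracy(w))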

    The problem the neural network is trying to solve is to classify Mashable articles into one of the popular, neutral, or unpopular groups. This is the same problem discussed in the supervised-learning write-up on Mashable articles above.

    Here, I use three different randomized optimization algorithms to try to find good weights for the neural network and see how each one performs at classifying the Mashable articles. All of the experiments are run through a Java package called ABAGAIL. I forked the code into my repository and ran its built-in optimization algorithms: random hill climbing, simulated annealing, and a genetic algorithm.

    Initially, the number of iterations was varied across the optimization algorithms while keeping the dataset the same. The iterations were increased from 200 all the way to 6,000 to see whether the accuracy of the neural network would improve. The accuracy for each of the optimizers is shown below.

    Back Propagation

    As a baseline, MultilayerPerceptron from Weka was used to run the neural network with 30 hidden nodes, using backpropagation to find the weights. The number of hidden nodes was chosen based on the number of input attributes, which is 57. The classifier was run for various numbers of epochs, which gave different accuracy rates, plotted below:

    [Figures: neural network learning curve and complexity]

    Random Hill Climbing

    We can see from the graphs below that the accuracy of the neural network did not increase linearly with the number of iterations. Interestingly, lower iteration counts gave better test scores than higher ones. Random hill climbing with fewer than 1,000 iterations gave an accuracy of 57%, and increasing the number of iterations did not improve the network's accuracy.

    [Figures: random hill climbing iterations and learning curve]

    We can conclude that increasing the number of iterations does not increase the testing accuracy, while the time it takes to run each simulation increases significantly with the iteration count. Therefore, the learning curves were obtained by keeping the number of iterations constant and varying the amount of training data.

    With random hill climbing, as the sample size increases, both the training and testing error decrease significantly; the rising testing accuracy means the randomized optimization is able to find reasonable weights. However, the weights are not as good as those produced by backpropagation. The results could have been improved by averaging scores over many runs instead of a single run, but that is computationally expensive and time-consuming.

    Simulated Annealing

    Simulated annealing gives much better accuracy than the other two randomized algorithms. The testing accuracy increases with the sample size as the bias of the classifier decreases; it seems that providing even more data would have increased the accuracy further. This can be explained by the fact that, instead of only climbing hills, the algorithm also occasionally descends them. It gave the best accuracy of about 57.5%, better than the other randomized algorithms but not as good as backpropagation for training the neural network. As with random hill climbing, the testing accuracy does not increase with the number of iterations.
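
    For contrast with plain hill climbing, the acceptance rule that lets simulated annealing occasionally go downhill can be sketched as follows, continuing the synthetic setup (accuracy, w, rng, n_weights) from the hill-climbing sketch above; the temperature schedule is an arbitrary choice for illustration.

        import math

        T, cooling = 1.0, 0.995
        for _ in range(2000):
            candidate = w + rng.normal(scale=0.05, size=n_weights)
            delta = accuracy(candidate) - accuracy(w)
            # Accept improvements always; accept worse moves with a probability
            # that shrinks as the temperature cools.
            if delta >= 0 or rng.random() < math.exp(delta / T):
                w = candidate
            T *= cooling
        print(accuracy(w))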

    [Figures: simulated annealing iterations and learning curve]

    Genetic Algorithm

    The genetic algorithm shows little variation in testing accuracy with the sample size. However, the training accuracy decreases much faster than the testing accuracy, which suggests the network trained with the genetic algorithm is not memorizing the training set but learning more general patterns in the data. The training accuracy falls with the sample size, which means the model has moved from over-fitting toward under-fitting. Even so, the genetic algorithm only reaches about 57% accuracy, whereas backpropagation reached about 73%. Interestingly, for the genetic algorithm the testing accuracy actually decreases as the number of iterations increases.

    [Figures: genetic algorithm iterations and learning curve]

    Conclusion

    The poor performance of the randomized optimization algorithms could be explained by stochastic error, since the starting point of each algorithm is randomly selected. Another factor could be local optima in which the algorithms get stuck. Each algorithm seems to plateau at around 57% accuracy, which suggests they converge to similar weights. The subpar performance of the genetic algorithm can be explained by the fundamental theorem of genetic algorithms (the schema theorem), which assumes an infinitely large population; in this experiment the population size had to be fixed so that the algorithm finished in a finite amount of time, and the lack of a very large dataset could also have been a hurdle.

    References:

    Baluja, Shumeet, and Rich Caruana. "Removing the genetics from the standard genetic algorithm." Machine Learning: Proceedings of the Twelfth International Conference. 1995.

    Isbell Jr, Charles L. "Randomized Local Search as Successive Estimation of Probability Densities." A longer tutorial version of the 1997 paper on MIMIC that includes a derivation for MIMIC with trees. http://www.cc.gatech.edu/~isbell/tutorials/mimic-tutorial.pdf

    Chen, Luonan, and Kazuyuki Aihara. "Chaotic simulated annealing by a neural network model with transient chaos." Neural networks 8.6 (1995): 915-930.

    Caruana, Rich, and Alexandru Niculescu-Mizil. "Data mining in metric space: an empirical analysis of supervised learning performance criteria." Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2004.

    Online News Popularity. (n.d.). Retrieved September 21, 2015. http://archive.ics.uci.edu/ml/datasets/Online+News+Popularity

    Check out the code on GitHub

    The Good, the Bad, and the Ugly Tweets

    Code on GitHub

    Sentiment classification of written text is a challenging problem for machines to master. Accurately identifying whether the author of a text means a positive or negative sentiment is difficult when the text does not convey much emotion, and some aspects of human expression, like humor and sarcasm, do not translate well into writing. Classifying tweets is even harder because of the limited number of words that fit in 140 characters, and because tweets abound with emoticons, spelling errors, abbreviations, internet slang, and interspersed numbers rather than grammatically correct sentences.

    This classification problem is interesting because of its challenges and the abundance of applications if it is done well. The problem is akin to computers understanding human communication. Even though human emotion is not always binary or expressible as positive or negative sentiment, sentiment is a basic building block of it. If machines could decipher sentiment, our interactions with them would be much smoother, and we could leverage the power of computers to analyze text at large scale.

    In this regard, even though the problem is challenging to solve, it carries a lot of reward. The data used for this classification problem consists of tweets that have been hand-labelled as positive or negative based on the sentiment they convey.

    The downloaded data contains random samples of tweets from many users, categorized as positive or negative based on the sentiment expressed, so the labels could be biased by the judgment of the person categorizing the tweets. As part of preprocessing, the data has been stripped of emoticons and contains only alphanumeric characters.

    Examples of positive tweets:

    • Juuuuuuuuuuuuuuuuussssst Chillin!!
    • thanks to all the haters up in my face all day! 112-102
    • Feeling strangely fine. Now I'm gonna go listen to some Semisonic to celebrate http://tumblr.com/xcn21w6o7

    Examples of negative tweets:

    • i think mi bf is cheating on me!!! T_T
    • Omgaga. Im sooo im gunna CRy. I've been at this dentist since 11.. I was suposed 2 just get a crown put on (30mins)...
    • I missed the New Moon trailer...
    • this weekend has sucked so far

    Of the almost 12K data points, roughly 4,500 tweets are labelled negative and the remaining 5,428 are labelled positive. I divided the dataset into 8K tweets for training and 2K for testing, and kept another 2K as a holdout set for validation. Tweets were scrubbed of numbers and punctuation so that only space-separated words remain. For example, the following tweet:

    “.. Omgaga. Im sooo im gunna CRy. I've been at this dentist since 11.. I was suposed 2 just get a crown put on (30mins)...”

    is scrubbed by removing numbers and punctuation, as follows:

    “Omgaga Im sooo im gunna CRy Ive been at this dentist since I was suposed just get a crown put on mins”
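
    One way to reproduce that scrubbing step (an illustration, not the exact script used) is to drop every character that is not a letter or whitespace and collapse the spaces:

        import re

        def scrub(tweet: str) -> str:
            letters_only = re.sub(r"[^A-Za-z\s]", "", tweet)  # drop digits and punctuation
            return " ".join(letters_only.split())             # collapse extra whitespace

        print(scrub(".. Omgaga. Im sooo im gunna CRy. I've been at this dentist since 11.."))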

    The data is broken down into word vectors using a standard stemmer, and English stop words are removed. Word counts alone are not a good way to separate positive and negative sentiment because common English words show up in both kinds of tweets. Instead, each tweet is represented by its term frequency weighted by inverse document frequency (TF-IDF), so that words unique to negative or positive tweets carry more weight. The resulting vectors are sparse, mostly zeros. The labels are dummy coded, 0 for negative sentiment and 1 for positive, and the data is then fed into the various learning algorithms.
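
    A hedged sketch of that vectorization step is below; train_tweets, test_tweets and train_labels are assumed to be lists of scrubbed strings and their labels, and the stemmer mentioned above would be plugged in via a custom tokenizer, which is omitted here.

        from sklearn.feature_extraction.text import TfidfVectorizer

        vectorizer = TfidfVectorizer(stop_words="english")     # drop English stop words
        X_train_vec = vectorizer.fit_transform(train_tweets)   # sparse TF-IDF matrix
        X_test_vec = vectorizer.transform(test_tweets)

        y_train = [0 if label == "negative" else 1 for label in train_labels]  # dummy coding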

    The dataset contains 12,000 tweets, split 60% for training, 20% for testing, and 20% as a holdout set used for validation.

    Data Set:

                  Rows   Features
    Training      8000      15065
    Testing       2000      15065
    Holdout       2000      15065

    Decision Tree

    The dataset was then fed into a decision tree to see whether the tweets could be divided into positive and negative sentiment. DecisionTreeClassifier from the scikit-learn library was used without specifying the ‘max_depth’ parameter, so the tree expands until all leaves are pure; it uses Gini impurity by default. The classifier gave an accuracy of 64% with a depth of 1,500 nodes.

    Since the tree was too big and did not give impressive results, it was pruned iteratively and the accuracy was calculated against the holdout set. The pruning accuracy is plotted in the graph below, which shows that a decision tree of depth 450 gave better holdout accuracy than the tree with a depth of 1,500.

    [Figures: decision tree complexity and decision tree accuracy]

    Since pruning from a depth of 1,500 to 450 increased accuracy from 64% to 65%, that depth was used to build the final classifier. A 5-way cross validation was also done on the training set, and the resulting graph is included above; it shows that as the size of the data increases, training accuracy decreases and testing accuracy increases. The classifier with max_depth of 450 was built and evaluated on the test set of 2,000 rows.

    Pruning the tree did not change the accuracy much because the most important attributes are used near the root of the tree. Even though the dataset has a high number of features, only a limited number of words are strongly positive or negative, and those attributes sit close to the root, so pruning does not improve the accuracy by much.

    Boosting

    Boosting was implemented through the AdaBoost classifier in the scikit-learn library. To model the complexity, various classifiers were trained by varying the number of estimators and the depth of the base estimator. The results are plotted below to show the effect of each: depth was kept constant while varying the number of estimators, and vice versa.

    It can be inferred from the curves that increasing the number of estimators increases the time needed to train the algorithm, and the accuracy of the classifier also increases. However, increasing the depth of the base estimator does not always improve accuracy on the testing set, which could be because the base estimator is not optimal.

    [Figures: boosting complexity and boosting accuracy]

    After this initial investigation, various combinations were tried; among them, a base estimator with a depth of 4 and 5,000 estimators gave the best results, taking about 7 hours to run. It gave an accuracy of 68%, better than the other classifiers.

    [Figure: boosting learning curve]

    Increasing the number of estimators does not increase the accuracy by much, but it does increase the training time dramatically. This could be because the model over-fits the training data, which keeps it from being a good predictor on the testing set.

    Neural Network

    MultilayerPerceptron from Weka was used to build a neural network to classify tweets. Tweets were split into words and converted into vectors before being passed to the perceptron. Because of the very large number of features, the word vectors were huge, and the MultilayerPerceptron was not able to do better than 50%. With each additional epoch the time taken to build the classifier increases, but accuracy does not rise above 50%, as the curves below show:

    [Figures: neural network complexity and learning curve]

    Even after running for 900 epochs the accuracy does not improve and stays around 50%, meaning it misclassifies about half of the data. This suggests that this sentiment classification task is not well suited to backpropagation on a feed-forward network. Looking at the confusion matrix at 1,000 epochs, we can see that even though the training set contains roughly equal numbers of positive and negative tweets, the network is biased towards negative and classifies almost everything as negative.

                      Negative Predicted   Positive Predicted
    True Negative                    200                   50
    True Positive                    201                   50

    Support Vector Machine

    Support vector machines with both a linear kernel and a polynomial kernel were used to analyze the dataset, using the SVC module from scikit-learn to fit the data. Initially, SVC with a linear kernel and default settings gave an accuracy of 53% on the testing set, while a polynomial kernel of degree three gave 51% on the same test.

    The complexity of the linear SVC was analyzed by iterating over the penalty (‘C’) parameter, as shown in the graph below. The same was done for the polynomial kernel, except that the degree of the polynomial was varied instead of the penalty.

    [Figures: SVM linear and polynomial kernel complexity]

    The SVM gave varying holdout accuracy as the penalty was increased but, as the graph shows, changing the degree of the polynomial kernel did not change its accuracy at all. The linear kernel reached an accuracy of 69% on the holdout set, while the polynomial kernel stayed at 53%.

    Hence, the linear kernel with a ‘C’ parameter of 0.75 and the polynomial kernel with degree two were chosen to classify the testing set and to run cross validation:

    [Figures: SVM linear and polynomial kernel learning curves]

    The learning curves suggest that the dataset is close to linearly separable and of low degree, since the degree-two kernel performs much better than higher-degree kernels. The linear kernel's testing accuracy could be increased further by adding more data.

    K-Nearest Neighbors

    KNeighborsClassifier from scikit-learn was used to run the KNN classification. The classifier was first trained with the default setting of 5 nearest neighbors, which gave an accuracy of 55%, not very promising. To find the optimum number of neighbors, the training data was fit repeatedly while increasing the neighbor count, and each classifier's accuracy was measured against the holdout set. The figure below shows accuracy against the number of neighbors on the holdout set.

    [Figures: KNN complexity and learning curve]

    Using 281 neighbors gave an accuracy of 63% on the holdout set. It took about 5 minutes to iteratively compare the accuracy of KNN for neighbor counts from 5 to 800. As more neighbors are included, bias increases and the predictions drift away from the true labels, which causes the accuracy to go down.

    The classifier with 281 neighbors was fit on the training data and cross-validated, as plotted in the figure above. The chart suggests that adding more training data would have improved the classification accuracy. KNN gives scores similar to the other learning algorithms and could be improved by training on more data.

    Conclusion

    Among the methods tried, the ensemble method is the winner for classifying tweets into positive and negative groups, with the highest accuracy of 68%. The only downside to boosting is the training time, which grows steeply with the number of estimators. The decision tree is a close second with 64%, while the SVM and KNN classifiers give similar performance of about 63% on the testing set.

    This kind of data contains a huge number of attributes because many terms in tweets are not part of standard English. The many variations and misspellings of the same word also make it difficult to narrow down the features. For example, 'aaaa', 'aaahhhh', 'aaaw', and ‘aaahhaa’ mean something similar to us as we read the tweets, but the same emotion can be expressed in many different forms.

    Furthermore, the neural network did not perform well at all. This could be because Weka only provides backpropagation on a feed-forward perceptron, whereas sentiment analysis is better handled with a recurrent network.

    Even so, a best accuracy of 68% is reasonable for a dataset with roughly 15,000 features. The accuracy could be improved with some natural language processing: comparing synonyms, building lists of positive and negative words and giving them more weight, or converting emoticons into words like happy, sad, and angry.

    References

    Agarwal, Apoorv, et al. "Sentiment analysis of twitter data." Proceedings of the Workshop on Languages in Social Media. Association for Computational Linguistics, 2011.

    Caruana, Rich, and Alexandru Niculescu-Mizil. "Data mining in metric space: an empirical analysis of supervised learning performance criteria." Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2004.

    Kumar, Akshi, and Teeja Mary Sebastian. "Sentiment analysis on twitter." IJCSI International Journal of Computer Science Issues 9.3 (2012): 372-378.

    Martínez-Cámara, Eugenio, et al. "Sentiment analysis in twitter." Natural Language Engineering 20.01 (2014): 1-28.

    Twitter Sentiment Analysis Training Corpus (Dataset). (n.d.). Retrieved September 21, 2015, http://thinknook.com/twitter-sentiment-analysis-training-corpus-dataset-2012-09-22/

    Check out the code on GitHub

    Contact Lens Reminder Android App

    Code on GitHub

    During the summer of 2012, a friend and I got together and created an Android app. The main purpose of the app was to set a reminder for when to change your contact lenses. It was the only free app that did this at the time.

    The code is hosted in my friend's GitHub account: Code on GitHub

    Disclaimer: The app has been discontinued and removed from the app store.

    Augmenting Natural User Interface

    Code on GitHub

    One hot summer in College Station, I received an Undergraduate Summer Research Grant (USRG) from Texas A&M University to perform research. I settled into the Embedded Systems and Codesign Lab, with Dr. Rabi Mahapatra as my advisor, to learn about embedded systems.

    Back then, the Microsoft Kinect had recently been unveiled, so we decided to augment some of its capabilities with a system-on-chip embedded platform. I got real first-hand experience making an embedded system work for me. It started like any other project: I had to learn the basic functionality by running through the demos. The system available at the time was the Programmable System on Chip (PSoC) supplied by Cypress Semiconductor.

    Here is the setup for both the systems:

    [Figure: system schematic]

    At that time, Microsoft had not opened its API for the Kinect, so I had to use the OpenNI interface that the community had created to communicate with it. It was a good learning experience to see the inner workings of how the framework was laid out. I had some problems installing the software and getting it working, but after searching the internet, going through the Google Group, and reading blogs from other enthusiasts, I got over the hump. I wrote a painting application that took hand input via the Kinect and drew on a canvas.

    The other part of the system was the PSoC, which talked to the Kinect application through my code listening on a serial port. I had two PSoC boards working in tandem: one was the wireless receiver hooked up to the computer, and the other was a controller with capacitors on it. Since it had three capacitors, it could be used to select among three values, and it talked to the other PSoC over the air. I had some difficulty setting it up, but with help from other students I got it working.

    Once everything was set up, I could move my hand around to paint on the computer via the Kinect and select paints and colors with the PSoC I was wearing on my hand. Since it had three capacitors, I used them as red, green, and blue (RGB) values and selected colors based on those.
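
    The original PSoC/Kinect code is not reproduced here, but the listening side can be pictured with a small pyserial sketch; the port name, baud rate, and the comma-separated "r,g,b" line format are assumptions for illustration, not the actual protocol used.

        import serial  # pyserial

        # Hypothetical port and message format.
        with serial.Serial("/dev/ttyUSB0", 9600, timeout=1) as port:
            line = port.readline().decode("ascii", errors="ignore").strip()
            if line:
                r, g, b = (int(v) for v in line.split(","))
                print("brush color:", (r, g, b))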

    It was an interesting and challenging project, as I did not have much programming experience with embedded systems. However, I was able to complete the task and present my research. I also wrote a paper on it, which is included below.

    Complete Report (PDF)

    Check out the code on GitHub