Recently I organized an F# meetup in DC, and for our first event we brought in a wonderful speaker, Mathias Brandewinder, whose talk was titled “Coding Dojo: a gentle introduction to Machine Learning with F#”.
I was certainly a little nervous about our first meetup, but a ton of great people came out: experienced F# users, people who had used other functional languages (like OCaml), and people with no functional experience at all. The goal of the meetup was to write a k-nearest neighbors classifier for a previously posted Kaggle exercise on classifying pixelated numbers.
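For readers who have not met k-nearest neighbors before, here is a minimal sketch of the idea in Python. It is not the dojo's F# code: the function names, the four-pixel "images", and the choice of Euclidean distance are all assumptions made for illustration. The classifier looks up the k training samples closest to the new sample and lets them vote on the label.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two equal-length pixel vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(training, sample, k=3):
    """Label `sample` by majority vote among its k nearest training rows.

    `training` is a list of (label, pixels) pairs (a hypothetical layout,
    not the actual format of the Kaggle data).
    """
    nearest = sorted(training, key=lambda row: euclidean(row[1], sample))[:k]
    votes = Counter(label for label, _ in nearest)
    return votes.most_common(1)[0][0]


# Made-up four-pixel "images", purely for illustration.
training = [
    ("0", [0, 0, 0, 0]),
    ("0", [0, 10, 0, 0]),
    ("1", [255, 255, 0, 0]),
    ("1", [250, 240, 5, 0]),
    ("1", [245, 255, 0, 10]),
]

print(knn_classify(training, [240, 250, 0, 0]))  # closest rows are all "1"
print(knn_classify(training, [5, 5, 0, 0]))      # two of the three closest are "0"
```

There is no training step to speak of: all the work happens at classification time, which is part of what makes the algorithm approachable in a short session.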
Mathias did a great job of breaking people up into groups and then explaining what machine learning is and the criteria of the project in a surprisingly short time. I think people were a little scared of jumping in since he only talked for about 10 to 15 …
As you may have figured out, I like F# and I like functional languages. At some point I tweeted to the F# community lamenting the dearth of F# meetups in the DC area. Lo and behold, tons of people replied saying they’d be interested in forming one, and some notable speakers piped up and said they’d come and speak if I set something up.
So, if any of my readers live in the DC metro area, I’m organizing an F# meetup featuring Mathias Brandewinder. We’ll be doing a hands-on F# and machine learning coding dojo, which should be a whole buttload of fun. Here’s the official blurb:
Machine Learning is the art of writing programs that get better at performing a task as they gain experience, without being explicitly programmed to do so. Feed your program more data, and it will get smarter at handling
In machine learning, everyone talks about weights and activations, often in conjunction with a formula of the form wx+b. While reading “Machine Learning in Action” I frequently saw this formula but didn’t really understand what it meant. Obviously it’s a line of some sort, but what does the line mean? Where does w come from? I was able to muddle past this for decision trees and naive Bayes, but when I got to support vector machines I was pretty confused. I wasn’t able to follow the math, and conceptually things got muddled.
At this point, I switched over to a different book, “Machine Learning: An Algorithmic Perspective”.
Here, the book starts with a discussion on neural networks, which are directly tied to the equation of wx+b and the concept of weights. I think this book is much better than the other one I was reading. …
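To make the wx+b formula concrete, here is a minimal sketch of a single artificial neuron in Python. It is an illustration under stated assumptions, not code from either book: the function names, the AND-gate data, the learning rate, and the epoch count are all made up for the example. The neuron computes w·x + b and fires when the result is positive. The weights w are not chosen by hand; they start at zero and get nudged every time the neuron answers wrongly, which is where w "comes from".

```python
def neuron_output(weights, inputs, bias):
    """Compute w.x + b and fire (1) if the result is positive, else 0."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Learn weights and bias with the classic perceptron update rule.

    `samples` is a list of (inputs, target) pairs. The epoch count and
    learning rate are arbitrary choices for this small example.
    """
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            # error is +1, 0, or -1 depending on how the neuron was wrong
            error = target - neuron_output(weights, inputs, bias)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Logical AND: the output is 1 only when both inputs are 1.
and_gate = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, bias = train_perceptron(and_gate)

for inputs, target in and_gate:
    print(inputs, "->", neuron_output(weights, inputs, bias), "expected", target)
```

The line w·x + b = 0 is the boundary between the inputs the neuron says "yes" to and the ones it says "no" to, so learning the weights amounts to moving that line until it separates the two groups.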
At my work we use FogBugz for our bug tracker, and over the company’s lifetime we have accumulated tens of thousands of cases. I was thinking recently that this is an interesting repository of historical data, and I wanted to see what I could do with it. What if I could predict, with some degree of accuracy, who a case would be assigned to based solely on the case title? What about area? Or priority? Being able to predict who a case gets assigned to could alleviate a big time burden on the bug triager.
Thankfully, I’m reading “Machine Learning in Action” and came across the naive Bayes classifier, which seemed like a good fit for categorizing cases based on their titles. Naive Bayes is most famously used as part of spam filtering algorithms. The general idea is you train the classifier …
After following Mathias Brandewinder’s series on converting the Python from “Machine Learning in Action” to F#, I decided I’d give the book a try myself. Brandewinder’s blog is great, and he went through the book chapter by chapter, working through F# conversions. If you followed his series, this won’t be anything new. Still, I decided to do the same thing as a way to solidify the concepts for myself, and to differentiate my posts I am reworking the Python code into C#. For the impatient, the full source is available on my GitHub.
This post will discuss the ID3 decision tree algorithm. ID3 is an algorithm that’s used to create a decision tree from a sample data set. Once you have the tree, you can then follow the branches of the tree until you reach a leaf and that will give you a classification for your sample.
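As a rough illustration of both halves of that description (building the tree, then following its branches to a leaf), here is a minimal ID3 sketch in Python. It is not the C# code from the post or the listing from the book: the function names, the tuple-based tree layout, and the tiny data set are assumptions made for this example. At each node ID3 splits on the column with the highest information gain, meaning the column whose split reduces the entropy of the labels the most.

```python
import math
from collections import Counter


def entropy(rows):
    """Shannon entropy of the class labels (last element of each row)."""
    counts = Counter(row[-1] for row in rows)
    total = len(rows)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def split(rows, column, value):
    """Rows whose feature `column` equals `value`."""
    return [row for row in rows if row[column] == value]


def best_column(rows, columns):
    """Pick the column with the highest information gain."""
    base = entropy(rows)

    def gain(column):
        remainder = 0.0
        for value in set(row[column] for row in rows):
            subset = split(rows, column, value)
            remainder += len(subset) / len(rows) * entropy(subset)
        return base - remainder

    return max(columns, key=gain)


def build_tree(rows, columns):
    """Return a leaf label, or a (column, {value: subtree}) tuple."""
    labels = [row[-1] for row in rows]
    if len(set(labels)) == 1:      # pure node: every row agrees
        return labels[0]
    if not columns:                # nothing left to split on: majority vote
        return Counter(labels).most_common(1)[0][0]
    column = best_column(rows, columns)
    remaining = [c for c in columns if c != column]
    branches = {value: build_tree(split(rows, column, value), remaining)
                for value in set(row[column] for row in rows)}
    return (column, branches)


def classify(tree, sample):
    """Follow branches until a leaf is reached.

    Raises KeyError for a feature value never seen during training.
    """
    while isinstance(tree, tuple):
        column, branches = tree
        tree = branches[sample[column]]
    return tree


# Made-up data: (feature 0, feature 1, label).
rows = [(1, 1, "yes"), (1, 1, "yes"), (1, 0, "no"), (0, 1, "no"), (0, 1, "no")]
tree = build_tree(rows, columns=[0, 1])
print(tree)
print(classify(tree, (1, 1)))
```

Representing a leaf as a bare label and an internal node as a tuple keeps the traversal loop short; a real implementation would likely use a proper node type and decide how to handle feature values that never appeared in the training data.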