A few weeks ago we had our second DC F# meetup, where speaker Phil Trelford led a hands-on session introducing decision trees. The goal of the meetup was to see how good a predictor we could build of who would live and who would die on the Titanic. Kaggle has an excellent data set that shows age, sex, ticket price, cabin number, class, and a bunch of other useful features describing Titanic passengers.
Phil followed Mathias’ format and had an excellent .fsx script that walked everyone through it. I think the best predictor that someone made was close to 84%, though it was surprisingly difficult to exceed that in the short time we had to work on it. I’d implemented my own Shannon entropy based ID3 decision tree in C#, so this wasn’t my first foray into decision trees, but the compactness of the tree …