
Shareable zsh environment: EnvZ

Introducing EnvZ.

What is EnvZ?

During the course of normal production development we all tend to write a bunch of shell scripts and other useful command line utilities that help us out. Usually they end up being a bunch of one-offs or get stored in one mega .zshrc file. However, there’s something to be said for having a small framework to share environment utilities and to use as a jumping-off point to “version” a shared set of utilities with teammates.

With that in mind I’ve been building out a small zsh bootstrapper that builds on tools that I like and use (like yadr) and gives you a way to add pluggable team modules to it. A pluggable module is just a symlink to another directory whose .sh files are auto-loaded on shell start. And while it sounds like a small thing, it’s actually really nice to have different teams version different sets of shell scripts and be able to easily link and share them across environments.

Example

For example, let me make a quick folder called team1, put in a dummy shell script and link it to my environment:

[Screenshot: creating the team1 folder, adding a dummy script, and linking it into the environment]

Notice how our function exists immediately after linking the env!

To unload it:

[Screenshot: unloading the team1 environment]

You can imagine now having multiple git repos that you want to share as team specific utilities or bootstrapping. All someone has to do is check out your folder and add it to their environment.

Features

Things EnvZ gives you

  • Opinionated bootstrap (installs Python and pip, checks for yadr, gives you a default GUI editor [Atom])
  • Simplifies reloading your zsh environment and autocompletions
  • Lets you break up your environment into multiple folders or git repos and link them to yours with simple commands that autocomplete
  • Defines a clean place to add zsh completion files
  • Automatically sets up GitHub Enterprise to work with hub and provides JSON command line parsing with jq
  • Automatically installs cask if it’s not there
  • A hook into your teammates’ source directory. If everyone uses EnvZ then you can be assured that $SRC_DIR is set
  • Easier pull request autocomplete (via the pr command, the current hub one is broken) that lets you pick the target branch and a message

I like yadr and other opinionated setups because they get you up and running fast, but I always found that once you are up and running those bootstrappers didn’t think about how to share your configurations and other utilities. With a team of 3 or 4 people you may want to have some scripts for you, some for the team, and maybe some for side projects that you can have people load up.

If that’s the case, EnvZ is a good option. It’s lightweight, easy to change, and makes it easy to load up new envs.

Enjoy!

Review of my first time experience with haskell editors

When you start learning a new language the first hurdle to overcome is how to edit, compile, and debug an application. In my professional career I rely heavily on Visual Studio and IntelliJ IDEA as my two IDE workhorses. Things just work with them. I use Visual Studio for C#, C++, and F# development and IDEA for everything else (including Scala, TypeScript, JavaScript, Sass, Ruby, and Python).

IDEA had a Haskell plugin, but it didn’t work and threw exceptions in IntelliJ 12+. Since my main IDEs wouldn’t work with Haskell, I started researching what I could use.

Requirements

While some people frown on the idea of an IDE, I personally like them. To quote Erik Meijer:

I am hooked on autocomplete. When I type xs “DOT” it is a cry for help what to pick, map, filter, flatMap. Don’t know upfront.

Not only that, but I want the build system hidden away, and I want immediate type checking and error highlighting. I want code navigation, syntax highlighting, and an integrated debugger. I want all that and I don’t want to have to spend more than 30 seconds getting started. The reason is that I have problems to solve! The focus should be on the task at hand, not fiddling with an editor.

In college I used Vim, and while it was excellent at what it did, I found that it really wasn’t for me. Switching between command mode and edit mode was annoying, and I really just want to use the mouse sometimes. I also tried Emacs, and while it did the job, I think the learning curve was too high without enough “oo! that’s cool!” moments to keep me going. If I did a lot of terminal work (especially remote) then mastering these tools would be a must, but I don’t. I know enough to do editing when I have to, but I don’t want to develop applications in that environment. When you find a good IDE (whether it’s a souped-up editor or not) your productivity level skyrockets.

Getting Haskell working

Even though I’m on a Windows machine I still like to use Unix utilities. I have a collection of Unix tools like ls, grep, sort, etc. It turns out this is kind of a problem when installing Haskell. You need the official GNU utils for wget, tar, and gzip, otherwise certain installations won’t work. Also, if you have TortoiseGit installed on your machine and in your path, some other Unix utils come along with it. To get Haskell working properly I had to make sure the GNU utils came first in the path, before any of the other tools.

On top of that, I wasn’t able to get the cabal package for Hoogle to install on Windows. About a week later, when I was trying to get Haskell up and running again, I found this post which mentioned that they had just fixed a Windows build problem.

Leksah

Once Haskell was built, I turned to finding an IDE. My first Google search pointed me to Leksah, which at first looked like exactly what I wanted. It had auto completion, error checking, debugging, etc. And it had a sizzlin’ dark theme that I thought was cool. I installed the 2013 Haskell Platform (which contains GHC 7.6.3) and tried to run the Leksah build I got from their site. Being a Haskell novice, I didn’t know that you have to run a Leksah build compiled against the GHC version you have, so nothing worked! Leksah loaded, but I was immediately bombarded with questions about workspaces, cabal files, modules, etc. This was overwhelming. I just wanted to type in some Haskell and run it.

Once I figured all that out, though, I couldn’t get the project to debug or any of the Haskell modules to load. Auto complete also wouldn’t work.

Frustrated, I spent 2 days searching for solutions. I eventually realized I needed the right version of Leksah and found a beta build posted in the Leksah Google forums. Unfortunately this had other issues. I again couldn’t debug (clicking the debug button enabled and then immediately disabled it), the GTK skin looked wonky, and right-clicking opened menus 20 pixels above where the mouse actually was.

Given all this, I gave up on Leksah.

SublimeText

The next step was Sublime Text with the Sublime Text Haskell plugin. I was skeptical here since Sublime Text is really just a fancy text editor, but people swore by it so I gave it a shot. Here I had better luck getting things to work, but I was still unhappy. For a person new to Haskell, the exploratory aspect just wasn’t there. There’s no integration with GHCi for debugging, and I couldn’t search packages for what I wanted. Auto complete was flaky at best: it wouldn’t pick up functions in other files and wouldn’t prompt me half the time.

Still, it looked sharp and loaded fast. I was a big fan of the REPL plugin, but loading things into the REPL was kind of a pain. I also liked all the hot keys: adding inferred types was easy and checking types was reasonably easy. But the lack of good code navigation and proper auto completion irked me.

EDIT: I originally wrote this a few weeks ago even though it was just published today, and since then the REPL loading was fixed and so were a bunch of other bugs. In the end I’ve actually been using Sublime Text 2 for most of my small project editing, even though I liked the robustness of EclipseFP a lot.

EclipseFP

EclipseFP is where I finally hit my stride. Almost immediately everything worked. Debugging was great, as were code navigation, syntax highlighting, errors shown as you code, etc. Unfortunately I couldn’t get the Hoogle panel to work, but the developer was incredibly responsive and walked me through the issue (and updated the plugin to work with the new Eclipse version, “Kepler”). I also enjoyed the fact that working in a file auto-loaded it into the GHCi REPL so I could edit and then test my functions quicker. On top of that, the developer recently submitted a pull request to the Eclipse theme plugin so new dark themes will be available soon!

One thing I do wish is that the REPL had syntax highlighting like the sublimeText REPL did, but that’s OK.

Conclusion

In the end, while I can see how people more familiar with Haskell would choose the lightweight editor route (such as sublime), people new to the language really need a way to get up and running fast. Without that, it’s easy to get turned off from trudging through and learning a new language. A good IDE helps a user explore and automates a lot of the boring nastiness that comes with real development.

Linear separability and the boundary of wx+b

In machine learning, everyone talks about weights and activations, often in conjunction with a formula of the form wx+b. While reading Machine Learning in Action I frequently saw this formula but didn’t really understand what it meant. Obviously it’s a line of some sort, but what does the line mean? Where does w come from? I was able to muddle past this for decision trees and naive Bayes, but when I got to support vector machines I was pretty confused. I wasn’t able to follow the math, and conceptually things got muddled.

At this point, I switched over to a different book, Machine Learning: An Algorithmic Perspective.


Here, the book starts with a discussion on neural networks, which are directly tied to the equation of wx+b and the concept of weights. I think this book is much better than the other one I was reading. The author does an excellent job of describing the big picture of what each algorithm is, followed by how the math is derived. This helped put things in perspective for me and let me peek into the workings of each algorithm without any glossed over magic.

Neural networks

In a neural network, you take some set of inputs, multiply each one by some value, and sum the results. If that sum is greater than some threshold, the neuron fires. The value you multiply each input by is called its weight. Whether the sum crosses the threshold is decided by an activation function.

Source image http://en.wikibooks.org

But let’s say the inputs to all these nodes are zero, and you still want the neuron to fire. Zero times any weight is zero, so the node can never fire. This is why in neural networks you introduce a bias node. This bias node always has the value of one, but it also has its own weight. The bias node can offset inputs that are all zero but should still trigger an activation.
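To make the bias node concrete, here is a minimal sketch in plain Python. The function name `fires`, the threshold of zero, and all the weight values are made up for illustration; they are not from either book.

```python
def fires(inputs, weights, bias_weight=0.0, threshold=0.0):
    """Return True if the weighted sum of the inputs exceeds the threshold.

    The bias node always outputs 1, so it adds 1 * bias_weight to the sum.
    """
    total = sum(x * w for x, w in zip(inputs, weights)) + 1 * bias_weight
    return total > threshold


zero_inputs = [0.0, 0.0, 0.0]
weights = [0.4, -0.2, 0.9]  # illustrative values only

# Without a bias the sum is 0 no matter what the weights are.
no_bias = fires(zero_inputs, weights)

# With a positively weighted bias node the neuron can fire on all-zero input.
with_bias = fires(zero_inputs, weights, bias_weight=0.5)

print(no_bias, with_bias)
```

With all-zero inputs the only term that can move the sum is the bias weight, which is exactly the offset described above.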

Source image http://www.codeproject.com/Articles/16419/AI-Neural-Network-for-beginners-Part-1-of-3

A common way of calculating the activation of a neural network is to use matrix multiplication. Remembering that a neuron fires if the input times the weights is above some value, to find the sum you can take a row vector of the inputs and multiply it by a column vector of the weights. This gives you a single value that you can pass to a function that determines whether you want to fire or not.
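As a rough sketch of that row-times-column multiplication, the snippet below uses plain Python lists instead of a matrix library. The helper names `matmul` and `step`, the threshold of 0.5, and the numbers are assumptions chosen for the example.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    columns_of_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns_of_b]
            for row in a]


def step(value, threshold=0.5):
    """A simple activation function: 1 if the neuron fires, else 0."""
    return 1 if value > threshold else 0


inputs = [[1.0, 0.0, 1.0]]        # 1 x 3 row vector of inputs
weights = [[0.5], [-0.3], [0.2]]  # 3 x 1 column vector of weights

# A (1 x 3) times a (3 x 1) gives a 1 x 1 matrix: the single weighted sum.
activation = matmul(inputs, weights)[0][0]
fired = step(activation)

print(activation, fired)
```

The 1 x 1 result is the same number you would get by multiplying each input by its weight and adding them up by hand.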

In this way you can think of a neural network as a basic classifier. Given some input data it either fires or it doesn’t. If the input data doesn’t cause a firing then the output class can be thought of as 0, and if it does fire then the output class can be thought of as 1.

Now it makes sense why the w and the x are bold. Bold arguments, in math formulas, represent vectors and not scalars.

Classifiers

But how does this line relate to classifiers? The point of classifiers is, given some input feature data, to determine what kind of thing it is. With a binary classifier, you want to find if the input data classifies as a one or a zero.

Here is an image representing the margin in a support vector machine

Source wikipedia

The points represent different instances, the color of the point tells you what class it is, and the x1 and x2 coordinates represent features. The point at (x1, x2) is an instance that maps to that feature set.

The line marks the points where, if you take an x1 and x2 feature set, multiply each feature by its fixed weight, sum the results, and add the bias (so x1*w1 + x2*w2 + b), the result sits exactly at the firing threshold. Everything on one side of the line activates (the neuron fires), and everything on the other side doesn’t. This means that the line is the delineation of what is one class vs. what is another class.

What confused me here is that there is a discussion of w, but the axes are only in relation to x1 and x2. Really what that means is that

f(x1, x2) = w1*x1+w2*x2 + b

This is the ideal line that represents classification. w is a fixed vector here; only the output of the function varies with the feature input. The b of the function represents the weight of the bias node: it’s independent of the input data.
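Here is a small sketch of that function used as a classifier. The weights w1 = 1, w2 = 1, the bias b = -3, and the sample points are invented for the example, and it assumes the neuron fires when f is greater than zero.

```python
def f(x1, x2, w1, w2, b):
    """The linear function from the text: w1*x1 + w2*x2 + b."""
    return w1 * x1 + w2 * x2 + b


def classify(x1, x2, w1=1.0, w2=1.0, b=-3.0):
    """Class 1 if the point lands on the firing side of the line, else 0."""
    return 1 if f(x1, x2, w1, w2, b) > 0 else 0


# Each point is one instance with two features (x1, x2).
points = [(0.5, 1.0), (1.0, 1.5), (2.5, 2.0), (3.0, 3.0)]
labels = [classify(x1, x2) for x1, x2 in points]

print(labels)
```

With these made-up weights the boundary is the line x1 + x2 = 3. The weights and bias never change from point to point; only x1 and x2 do.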

wx+b defines a linear classifier, one that works when the classes are linearly separable. In two-dimensional space the boundary is a line, but you can classify in other spaces too. If your input has 3 features you can create a separating plane, and in more dimensions you can create a hyperplane. This is the basis of kernel functions: mapping feature sets that aren’t linearly separable in one space into feature sets that are linearly separable in another space (i.e. projection).
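The projection idea can be sketched with a toy example. The data set, the mapping `phi`, and the weights below are all made up, and the mapping is written out explicitly, which is a simplification of what a real kernel function does. The points live on a single axis and are class 1 when they are far from zero, so no single cut on that axis separates them. Adding x squared as a second feature makes them separable by a straight line.

```python
def separable_1d(xs, labels):
    """Check every possible single cut on the number line, in both directions."""
    pts = sorted(xs)
    cuts = [pts[0] - 1] + [(a + b) / 2 for a, b in zip(pts, pts[1:])] + [pts[-1] + 1]
    for t in cuts:
        for sign in (1, -1):
            if [1 if sign * (x - t) > 0 else 0 for x in xs] == labels:
                return True
    return False


def phi(x):
    """Hypothetical feature map: project a 1-D point into 2-D as (x, x^2)."""
    return (x, x * x)


def classify(point, w=(0.0, 1.0), b=-1.0):
    """Linear classifier w.x + b in the projected 2-D space."""
    return 1 if w[0] * point[0] + w[1] * point[1] + b > 0 else 0


xs = [-2.0, -1.5, -0.5, 0.0, 0.5, 1.5, 2.0]
labels = [1, 1, 0, 0, 0, 1, 1]  # class 1 when the point is far from zero

one_dim_ok = separable_1d(xs, labels)
predicted = [classify(phi(x)) for x in xs]

print(one_dim_ok, predicted == labels)
```

In the projected space the separating line is simply x squared = 1, so the same wx+b machinery works once the features are mapped somewhere more convenient.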