May, 2013

Tech Talk: Sorting of ratings

Today’s tech talk discussed different ways to sort items in a rating system. The topic revolved around a blog post we discovered a while ago that breaks down different problems with star-based sorts.

The article describes a few problems:

Rating type: Good – Bad

The issue here is that when you use only the difference between positive and negative ratings, the results skew toward highly popular (but possibly also highly disliked) items. For example, an item with 200 upvotes and 50 downvotes would have a score of 150, while an item with 125 upvotes and no downvotes would score lower, at 125. The team and I agreed that this isn’t a good way to sort ratings, since the absence of negatives is a stronger indication of a positive review. I think most people actually do this kind of analysis in their mind: if something is highly … Read more
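To make the numbers concrete, here is a small sketch that compares the difference score with the plain fraction of positive votes. The function names are made up for illustration, and the fraction is only a contrast, not the method the article recommends (a plain fraction over-rewards items with very few votes).

```cpp
// Difference score: upvotes minus downvotes.
int netScore(int up, int down) {
    return up - down;
}

// Fraction of all votes that are positive; 0 when there are no votes yet.
double positiveFraction(int up, int down) {
    int total = up + down;
    return total == 0 ? 0.0 : static_cast<double>(up) / total;
}
```

With the example above, netScore(200, 50) beats netScore(125, 0), while positiveFraction ranks the two items the other way around.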

Byte arrays, typed values, binary reader, and fwrite

I was trying to read a binary file created by a native app using the C# BinaryReader class but kept getting weird numbers. When I checked the hex in Visual Studio I saw that the bytes were backwards from what I expected, indicating endianness issues. This threw me for a loop since I was writing the file from C++ on the same machine that I was reading it on in C#. Also, I wasn’t sending any data over the network, so I was a little confused. Endianness is usually an issue across machine architectures or over the network.

The issue is that I ran into an endianness problem when writing values byte by byte, versus writing them using the actual data type of the object. Let me demonstrate the issue.
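In outline, the mismatch can be sketched as follows. This is an illustration under assumptions rather than the post's own code: the helper names are hypothetical, the byte-by-byte writer is assumed to emit the most significant byte first, and the reader mimics BinaryReader, which reads multi-byte integers as little-endian.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Bytes as a typed write such as fwrite(&value, sizeof value, 1, f) would
// produce them: whatever order the machine uses natively.
std::vector<unsigned char> typedBytes(std::uint16_t value) {
    std::vector<unsigned char> out(sizeof value);
    std::memcpy(out.data(), &value, sizeof value);
    return out;
}

// Bytes written one at a time, most significant first, i.e. in the order
// the digits appear when the value is written as 0xFF11.
std::vector<unsigned char> byteByByte(std::uint16_t value) {
    return { static_cast<unsigned char>(value >> 8),
             static_cast<unsigned char>(value & 0xFF) };
}

// Interprets two bytes as a little-endian 16-bit value.
std::uint16_t readLittleEndian(const std::vector<unsigned char>& bytes) {
    return static_cast<std::uint16_t>(bytes[0] | (bytes[1] << 8));
}
```

On a little-endian machine, readLittleEndian(typedBytes(0xFF11)) round-trips to 0xFF11, whereas readLittleEndian(byteByByte(0xFF11)) comes back as 0x11FF, even though both files were produced and consumed on the same machine.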

What happens if I write 65297 (0xFF11) using C++?

#include "stdafx.h"
#include "fstream"

int _tmain(int argc, _TCHAR* argv[])
{
	char 
Read more

Why \d is slower than [0-9]

I learned an interesting thing today about regular expressions via this Stack Overflow question. \d, commonly used as a shorthand for digits (which we usually think of as 0-9), actually checks against all valid Unicode digits.

Given that, it makes sense why \d in a regular expression is slower, since it has to check against all possible digit types. In C# you can restrict the regular expression to ECMAScript-compliant behavior (RegexOptions.ECMAScript), which doesn’t include the full Unicode set of digits.

While I’m neither the question asker nor answerer, I wanted to share since this is something I didn’t know about before.… Read more

Minimizing the null ref with dynamic proxies

In a production application you can frequently find yourself working with objects that have a long accessor chain like

student.School.District.Street.Name

But when you want to program defensively you need to null-check every reference type along the way. So your accessor chain looks more like this instead

if (student.School != null)
{
    if (student.School.District != null)
    {
        if (student.School.District.Street != null)
        {
            s += student.School.District.Street.Name;
        }
    }
}

Which sucks, especially since it’s easy to forget a null check, not to mention that it clutters up the code. Even if you used an option type, you’d still have to check whether it’s something or nothing, and dealing with huge option chains is just as annoying.

One solution is to use the maybe monad, which can be implemented using extension methods and lambdas. While this is certainly better, it can still get unwieldy.
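The post's code is C#, but the maybe-style chaining idea can be sketched in C++ with a small helper and lambdas. Everything here is hypothetical and only mirrors the accessor chain from the example: the struct definitions, the lowercase member names, and the helper with are all assumptions made for illustration.

```cpp
#include <string>

// Hypothetical types mirroring student.School.District.Street.Name.
struct Street   { std::string name; };
struct District { const Street* street = nullptr; };
struct School   { const District* district = nullptr; };
struct Student  { const School* school = nullptr; };

// Applies f to *p when p is non-null; otherwise propagates null.
// f must return a pointer so that calls can be chained.
template <typename T, typename F>
auto with(const T* p, F f) -> decltype(f(*p)) {
    return p ? f(*p) : nullptr;
}

// Returns the street name, or an empty string if any link is missing.
std::string streetName(const Student& student) {
    const Street* street =
        with(with(student.school,
                  [](const School& s) { return s.district; }),
             [](const District& d) { return d.street; });
    return street ? street->name : std::string();
}
```

The null checks now live in one place, but each extra link in the chain adds another nested call, which is the sort of unwieldiness mentioned above.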

What I … Read more

Bad image format “Invalid access to memory location”

Wow, two bad image format posts in one day. So, the previous post talked about debugging 64-bit vs 32-bit assemblies. But after that was solved I ran into another issue, this time with the message:

Unhandled Exception: System.BadImageFormatException: Could not load file or assembly 'Interop.dll' or one of its dependencies. Invalid access to memory location. (Exception from HRESULT: 0x800703E6)
   at Program.Program.Run(Args args, Boolean fastStart)
   at Program.Program.Main(String[] args) in C:\Projects\Program.cs:line 36

Gah, what gives?

It seems that I had an interop DLL that was linking against pthreads. In debug mode, the DLL worked fine on a 32-bit machine, but in release mode I’d get the error. The only difference I found was that in debug the DLL was being explicitly linked against the pthreads lib file. Since pthreads was also being built as a dynamic library, its lib file contains information about functions and their addresses, but no actual code … Read more

Determining 64bit or 32 bit .NET assemblies

I work on a 64-bit machine but frequently deploy to 32-bit machines. The code I work on has native hooks, though, so I always need to deploy assembly entry points as 32-bit. This means I am usually paranoid about the build configuration. However, sometimes things slip through and a 64-bit DLL gets sent out or an entry point is built with Any CPU set. Usually this is caught on our continuous build server, with some cryptic reason given for why a unit test that should be working is actually failing.

When this happens, what you’ll get is a message like this:

Unhandled Exception: System.BadImageFormatException: Could not load file or assembly 'Some.dll' or one of its dependencies. An attempt was made to load a program with an incorrect format.
   at Test.Program.Run(Args args, Boolean fastStart)
   at Test.ProgramMain(String[] args) in Program.cs:line 36

The first thing I do here is to try and … Read more

Streaming video to ios device with custom httphandler in asp.net

I ran into an interesting tidbit just now while trying to dynamically stream a video file using a custom http handler. The idea here is to bypass the static handler for a file so that I can perform authentication/preprocessing/etc when a user requests a video resource and I don’t have to expose a static folder with potentially sensitive resources.

I had everything working fine on my desktop browser, but when I went to test on my iPhone I got the dreaded play button with a circle crossed out.

I hate that thing.

Anyway, streaming a file from the static handler worked fine, so what was the difference? This is where I pulled out Charles and checked the response headers.

From the static handler I’d get this:

HTTP/1.1 200 OK
Content-Type	video/mp4
Last-Modified	Wed, 15 May 2013 20:59:30 GMT
Accept-Ranges	bytes
ETag	"9077fe17af51ce1:0"
Server	Microsoft-IIS/7.5
X-Powered-By	ASP.NET
Date	Wed, 15 
Read more

Users by connections in SignalR

SignalR gives you events when users connect, disconnect, and reconnect; however, the only identifying piece of information you have at this point is their connection ID. Unfortunately it’s not very practical to identify all your connected users strictly by their connection IDs. Usually you have some other identifier in your application (userID, email, etc).

If you are using ASP.NET MVC3, you can access this information from the hub context via Context.User, but if you aren’t using MVC3 (as with a .NET-to-.NET client) a good workflow is to have your clients identify themselves on connect. They can call a known Register method on the hub and give you the identifying information of who they are.

At this point you have your unique identifier along with their connectionID, but you have to manage all their disconnections, reconnections, and multiple connections of the same client yourself. This post will go through … Read more
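The bookkeeping described here is language-neutral, so as a rough sketch (in C++ rather than the C# a SignalR hub would use) it might look like the following. The class ConnectionRegistry and its method names are hypothetical and are not part of SignalR. The sketch assumes add is called from something like the Register method and remove from the disconnect event.

```cpp
#include <map>
#include <set>
#include <string>

// Tracks which connection IDs belong to which application-level user.
class ConnectionRegistry {
public:
    // Called when a client identifies itself on a given connection.
    void add(const std::string& userId, const std::string& connectionId) {
        remove(connectionId);  // a connection ID maps to at most one user
        userByConnection_[connectionId] = userId;
        connectionsByUser_[userId].insert(connectionId);
    }

    // Called on disconnect. Returns true only when this was the user's
    // last open connection, i.e. the user has really gone offline.
    bool remove(const std::string& connectionId) {
        auto it = userByConnection_.find(connectionId);
        if (it == userByConnection_.end()) return false;
        const std::string userId = it->second;
        userByConnection_.erase(it);
        auto& connections = connectionsByUser_[userId];
        connections.erase(connectionId);
        if (!connections.empty()) return false;
        connectionsByUser_.erase(userId);
        return true;
    }

    // True while the user has at least one open connection.
    bool isOnline(const std::string& userId) const {
        return connectionsByUser_.count(userId) != 0;
    }

private:
    std::map<std::string, std::string> userByConnection_;
    std::map<std::string, std::set<std::string>> connectionsByUser_;
};
```

Keeping a set of connections per user means a second browser tab or a reconnect under a new connection ID does not make the user look offline when the old connection drops.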

Tech Talk: Path finding algorithms

Today’s tech talk was about path finding algorithms. The topic was picked because of a recent link shared on reddit that visualized different algorithms. The neat thing about the link is that you can really see how different algorithms and heuristics modify the route.

In general, path finding algorithms are based off a breadth first search. At each iteration while walking through the graph, you check the nearest neighbors and update the calculated weight of the path to get to each neighbor. If it is cheaper to get to the neighbor via your own node (than via whoever visited it previously), you update the neighbor’s weight to reflect that. This is pretty much Dijkstra’s algorithm. On its own, this gives you the cost of the shortest path, but not the path itself. To find the shortest path you mark each node with who its cheapest parent is (i.e. the node you need to … Read more
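A minimal sketch of the parent-marking idea, assuming a directed adjacency list with non-negative integer edge weights. The names Graph and shortestPath are made up for illustration, and a priority queue is used here to pick the next cheapest node.

```cpp
#include <algorithm>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// Adjacency list: graph[u] holds (neighbor, edge weight) pairs.
using Graph = std::vector<std::vector<std::pair<int, int>>>;

// Returns the cheapest path from source to target as a list of nodes,
// or an empty list if target cannot be reached.
std::vector<int> shortestPath(const Graph& graph, int source, int target) {
    const int kInf = std::numeric_limits<int>::max();
    std::vector<int> cost(graph.size(), kInf);
    std::vector<int> parent(graph.size(), -1);

    // Min-heap of (cost so far, node).
    using Entry = std::pair<int, int>;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> open;

    cost[source] = 0;
    open.push({0, source});

    while (!open.empty()) {
        Entry top = open.top();
        open.pop();
        int c = top.first, u = top.second;
        if (c > cost[u]) continue;  // stale entry, a cheaper route was found
        for (const auto& edge : graph[u]) {
            int v = edge.first, w = edge.second;
            if (c + w < cost[v]) {
                cost[v] = c + w;   // cheaper to reach v through u
                parent[v] = u;     // remember v's cheapest parent
                open.push({cost[v], v});
            }
        }
    }

    if (cost[target] == kInf) return {};

    // Walk the parent links back from the target, then reverse.
    std::vector<int> path;
    for (int n = target; n != -1; n = parent[n]) path.push_back(n);
    std::reverse(path.begin(), path.end());
    return path;
}
```

The cost array alone answers "how expensive is the cheapest route", while the parent array is what lets the route itself be rebuilt afterwards.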

Building better regular expressions

Every software developer has at one point in time heard the adage

If you have a problem and you think you can solve it with [threads|pointers|regex|etc], now you have two problems

For me, I’ve always told it with regex (and I think that’s the official way to do it). It’s not that threads and pointers aren’t hard, but more that with proper stylistic choices and with experience, they can be manageable and simple to debug. Regexes, though, have a tendency to spiral out of control. What starts as something simple always bloats into an enormously difficult to read haze of PERLgasms.

For example, I frequently wonder why in the 21st century we still deal with a syntax like this:

(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t]
)+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:
\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(
?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ 
\t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\0
31]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\
](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[

Even the most seasoned engineers couldn’t tell me what … Read more
