As a tribute, I thought I would share a recent example where this principle would have settled an argument on Stack Overflow, and perhaps spared two unwitting souls from becoming fodder for this blog post.
The question itself was pretty trivial. "How to select List<> by its index and return the content?"
One of the answers suggested using the ElementAt extension method as an alternative to using the standard indexer. No big deal, but what ensued was a pretty fun little comment argument. I've changed the names to protect the innocent:

So Bob is making a legitimate claim that using ElementAt would be inefficient compared to using an indexer. Unfortunately, he also makes a pretty wild claim about ElementAt not returning the same value as calling the indexer on a List<T> would.
Larry then decides to defend himself in a way that every programmer has probably done at some point; he writes a small test program. You can see the code here: http://ideone.com/H0A2X
Despite the test program being fundamentally flawed (see fixed program here: http://ideone.com/hooQB), Larry got the numbers he was looking for. Armed with a helping of confirmation bias and his new numbers, Larry decides to respond to Bob.
And what else did Larry have to say about his results?
"When it comes to errors, the MSDN documentation and a wealth of empirical evidence suggests that such concerns are wholly unfounded... The ElementAt() method is definitely faster... I have tested Lists of type double, float and string with similar results..."
In the end... both of these guys are wrong! So what is the truth? What does the source code say? Let's use dotPeek and find out:

Turns out that ElementAt is optimized for anything that implements IList<T>: in that case it simply calls the indexer. A 30-second look at the source code would have ended this argument before it started. In reality, ElementAt is 3 times slower when called on a List<T> because of the cast.
If you need to know how something works, then you need to go to the source code and see for yourself. Without this skill, you will never know for sure how stuff works. Don't waste your time arguing about how you think things work when the answer is right at your fingertips.
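For readers without a decompiler handy, the logic described above can be sketched roughly as follows. This is a paraphrase of the general shape, not the framework's actual source: the class name `ElementAtSketch` and the method name `ElementAtLike` are made up for illustration, and the real method's details may differ between framework versions.

```csharp
using System;
using System.Collections.Generic;

public static class ElementAtSketch
{
    // Hypothetical stand-in for Enumerable.ElementAt (an assumption for
    // illustration, not copied from the framework).
    public static T ElementAtLike<T>(IEnumerable<T> source, int index)
    {
        if (source == null) throw new ArgumentNullException("source");

        // Fast path: anything that implements IList<T> is handed straight
        // to its own indexer. The 'as' cast is the extra cost the post
        // attributes the slowdown to.
        IList<T> list = source as IList<T>;
        if (list != null) return list[index];

        if (index < 0) throw new ArgumentOutOfRangeException("index");

        // Slow path: walk the sequence one element at a time until
        // 'index' elements have been skipped.
        using (IEnumerator<T> e = source.GetEnumerator())
        {
            while (e.MoveNext())
            {
                if (index == 0) return e.Current;
                index--;
            }
        }

        // The sequence ended before reaching the requested position.
        throw new ArgumentOutOfRangeException("index");
    }
}
```

In this sketch, a call such as `ElementAtSketch.ElementAtLike(new List<int> { 10, 20, 30 }, 1)` takes the first branch and returns whatever `list[1]` returns, so on a `List<T>` the two approaches cannot disagree about the value; only the cost differs.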
Follow Jeff and Brandon's advice and Learn to Read the Source, Luke.
Nice post. Hope "Larry" and "Bob" read this :D
I have to listen to arguments like Larry and Bob's pretty often. The argument is always the same:
A: "I wonder how that works?"
B: "I think it does this..."
A: "Really? I would have thought it worked like this..."
And so on for a few minutes, before I interject with "don't speculate, find out".
I firmly believe that speculation, particularly in software development, is a huge time waster. I don't see why anyone would do it.
But it happens A LOT. I'm constantly amazed at the number of developers I run into who believe something completely erroneous about some particular framework, or language feature simply because they "heard" it was slower, inefficient, etc...
This is usually on display the most in cross-language arguments.
...and also a good argument for avoiding anything where you *can't* look at the source.
The problem with this is that it's terrible design. You don't build code to an implementation; you build it to a standard. There is zero guarantee that in the next version of the library the code for ElementAt will remain the same. The second guy from Stack Overflow is right: you write code to the spec unless you have no choice, because it means your code continues to work. Reading the source is certainly helpful, but you don't code based on what you find there unless you have no choice.
I'm not sure how understanding the underlying code you are using is "terrible design". Part of the success of OSS is that people are liberated, and encouraged, to dig into the underlying sources for themselves.
Also, I don't really think the point of this was that using ElementAt() was good or bad. The point is that both of the participants made erroneous claims about performance and behavior without really knowing. There was a lot of wasted effort when they could have just said "Hey, let's look at the code!"
There's nothing wrong with understanding your code, or your library, or quantum mechanics; knowledge is good.
My point is that coding should be done based on what the code says it will do, not what it does, if that makes any sense.
Microsoft could change the implementation of ElementAt tomorrow and, so long as the method still does what it says on the tin, they don't even have to document it. Lists by specification make no promises with regards to order; the fact that you can generally count on their order in most cases is an implementation detail.
Essentially what I'm saying is that looking at what the code does doesn't actually solve the above argument any more than the poster's timings do. There's no part of the documentation or spec which promises that behavior, and it could literally be changed on a whim. Microsoft are unlikely to do that, but the same is not true of all vendors.
What Chris said.
Really, this is actually one downside of open source. Folks go look at the source and code to that, and then cry foul when the next release of the framework breaks their implementation-dependent solution.
Does anyone remember the principles of "object oriented" programming or is that term just a dirty word now?
Exactly.
I wholly agree that it's important for developers to be able to read code and that it can solve many problems more quickly than any other way.
However, the important bit of what Bob said above was: "I'm stating that the interface itself does not provide any degree of ordering, thus it is *semantically* wrong." Implementation depends on interface, not the other way around. Even if List<T>'s implementation of ElementAt() uses the indexer (which, as Chris suggested, in the future it may not), the IEnumerable<T> interface does not guarantee ordering, which means that calling ElementAt() on any other implementation of IEnumerable<T> does not guarantee ordering, and you'd have to check the source for every single class that implements it if you want to depend on ordering. What's better is to find a way that you can check once, and depend on forever.
TL;DR: IEnumerable<T> itself does not guarantee ordering. List<T> is currently properly ordered, but could be changed at any moment, since it only needs to follow the spec set out by IEnumerable<T>. Checking source is good, it helps you learn about how things work, but you can't depend on it. Building good software means building something that can withstand minor changes in underlying platforms.