3 Ineffective Coding Habits Many F# Programmers Don't Have

Yan Cui speaking at London Functional Programmers in June, 2017

About this talk

Good coding habits can be the difference between being an average, good, or great programmer. Does the language you use affect your habits?


Transcript


I'm going to talk about three ineffective coding habits that many F# programmers don't have. This was going to be a talk about seven ineffective coding habits, but I just couldn't fit seven coding habits into a 25-minute slot. So in the end, I made a decision that I think more product people should make when they face a problem like this: you just brutally cut features down to what you can actually ship rather than compromise on quality. So that's that. Let's start by defining what a habit is, which, according to Wiktionary, is a settled or regular tendency or practice, especially one that is hard to give up. And good coding habits can be the difference between being a good programmer and being a great one, because they help you make good decisions with a minimum of cognitive effort. At the same time, bad coding habits make you repeat the same mistakes over and over without even realising. So I was at a talk by Kevlin Henney a few years back where he talked about the seven ineffective coding habits of many programmers. After the talk, I did a bit of soul-searching and looked back at my own career and all the cases I could remember where I had exhibited those ineffective coding habits. And to my surprise, even though I was splitting my time roughly half and half between C# and F#, pretty much all the cases I could think of happened when I was writing C#, which got me asking whether the language that I use makes a difference to the coding habits that I end up forming. Dijkstra once famously said that programming languages have a devious influence in shaping the way we think, so one would assume that eventually those thinking habits are going to translate into our coding habits as well. Keep that in mind as we go through the different coding habits we're going to talk about today, starting with Noisy Code, which I'm sure we all understand intuitively as a bit of a negative trait.
And we can all easily see that this bit of code is really noisy because of the sheer amount of comments, which are completely useless, for example. Or how the implementation on the left has far more implementation noise compared to the column on the right, which reminds me of something that Jeff Atwood wrote almost 10 years ago, that the best code is no code at all. Because every line of code you write has a cost, starting with the time and energy it takes to write the code, and having more code means having more surface area for bugs. And, of course, having more code means it takes someone, perhaps your future self, more time to read it, but more importantly, more time to understand it well enough to be able to find and ultimately fix bugs and maintain the codebase on an ongoing basis. Having more code also means you need more engineers to support and maintain it, and therefore you need to talk to more people to get stuff done, and it's a phenomenon that I think we, as an industry, have finally woken up to. And if you've been to conferences in the last couple of years, you may have heard a lot of people talk about the topic of Reverse Conway's Law. So back to what Jeff wrote about. He talked about how we should instead maximise the productivity of our developers by maximising the expressive power of the code that they do write so that, hopefully, you end up with less code that does the same thing or perhaps even more. And back to my original question, whether or not the language that I use makes a difference. Well, unfortunately, objective comparisons between programming languages are really hard to come by. Fortunately in F#, we had a guy called Simon Cousins. We still have a guy called Simon Cousins. He's still with us. He was able to give us a really comprehensive comparison between a C# implementation and an F# implementation of a non-trivial energy trading application that took, I think, 350,000 lines of code to implement in C#.
And the numbers pretty much speak for themselves, with the C# version requiring over 10 times as much code as the F# version. And if we break down the numbers, you start to see some trends. Because F# doesn't have nulls or braces as a core part of the language, straightaway you eliminate a large percentage of the syntactic noise that those parts of the language generate. And because, when you're writing in the functional paradigm, you also operate at a higher level of abstraction that's further removed from the sort of plumbing that you tend to get with the imperative paradigm, you often end up having to write far less code to implement the same things, and having far fewer lines of useful code means you also need fewer blank lines and fewer comments to explain what that code does. So all in all, you end up with a much higher signal-to-noise ratio. And that's not even the full story, because the C# project took, I think, five years with a maximum of eight developers, and they didn't finish implementing all the contracts that were laid out in the requirements. This was an energy trading company, so the contracts are the different types of energy trades you can do. I don't really know, I'm a game developer. Whereas the F# project took less than a year and had a maximum of three developers, with only Simon himself having had any prior experience with F#, and they managed to implement all the different contracts in that time. It's a fascinating story, and if you want to read more about it, there's a link at the bottom of the slide, which I will share later on. Which brings us to our next topic, which I like to call Visual Dishonesty. In his rather excellent blog, Daniel Higginbotham talked about how "a clean design is one that supports visual thinking so that people can meet their informational needs with a minimum of conscious effort."
Now, Daniel was talking about visual design here, but I think the same principles apply to code as well, in terms of how we lay out code so that its structure and hierarchy are obvious to people reading it without them having to think too much about it. And when we lay out one line of code over another, with indentation, we are telling the reader that, hey, there's some hierarchy here. Then he went on to talk about how, "You convey information by the way you arrange a design's elements in relation to each other. And that information is understood immediately, if not consciously, by the people looking at your designs." And that's great, but only if the visual relationships are accurate and obvious. If they're not, your audience is going to get confused, and they're going to spend even more effort going back and forth between different parts of the design to make sure they actually understand what it is you're trying to present to them. So with that in mind, when we are reading English or code, most of the time we are reading from left to right and top to bottom, except when it comes to reading nested method or function calls. All of a sudden, the flow of our reading is reversed. We're reading from right to left and bottom-up, and we've created this disparity in how we visually process the layout of our code. In F#, there's a pipe operator, which I think has been copied by Elixir, Elm, and a few other languages as well, which you can essentially use to write nested method calls as a pipeline that flows from left to right and top to bottom. What's happening here is that the first line of your code is evaluated and its output is then passed on to the next line as the last argument of a partially applied function. And this continues on so that, instead of a nested method call, you have a clear pipeline of all the different steps that are being done in this very simple function.
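The slide being described is not part of this transcript. As a minimal sketch of the idea, assuming two made-up helper functions (`square` and `isEven`, not from the talk), the same computation can be written as nested calls and as a pipeline:

```fsharp
// Hypothetical helpers, invented for this illustration.
let square x = x * x
let isEven x = x % 2 = 0

// Nested calls: the reader has to start from the innermost call
// and work outwards, that is, from right to left.
let nested =
    List.sum (List.map square (List.filter isEven [1 .. 10]))

// The same computation as a pipeline: read it top to bottom.
// Each line's result becomes the last argument of the
// partially applied function on the next line.
let piped =
    [1 .. 10]
    |> List.filter isEven
    |> List.map square
    |> List.sum

printfn "%d %d" nested piped // prints "220 220"
```

Both versions compute the same value; only the reading order differs.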
And then there's the whole argument about where we put our braces and how that affects the readability of our code. If you take this very simple C# method as an example, it's got indentation. It's got different arguments on each line. It probably looks pretty good, right? What happens if you squint? Now all of a sudden, well, where does the method body start and where does it end? And I know there's an if statement here somewhere, but where does the if statement's body start and where does it end? So even though we've got the indentation in place, the structure and hierarchy of our code is still not obvious without me having to eyeball the placement of the opening curly brace. Instead, what would happen if we just put the curly brace on a new line? Now we squint and the structure and hierarchy of our code is much more evident without having to track where the opening brace is. So it turns out that where you put your braces does matter, and it's not just a matter of personal preference. It affects readability in a big way and there's a right way to do it, which, I think, once it's been pointed out to us, at least to me, seems quite obvious. Why would you do it any other way? But it makes me ask, how did we miss it? How did so many of us end up in this argument about where you put the braces, whether you put them on a new line or at the end of a line? And I think, at least when I look at this problem, it all boils down to having two competing rules for structuring your code, at least in a C-style language. You use braces for the compiler. That's all the compiler cares about. But for humans to be able to read your code, you also need to put in whitespace. Combine that with the eagerness of many developers to follow best-practice guidelines on how many lines of code should be inside a method, and so when you end up with 62 or 64 lines of code, you know what? Let's superficially cut it back down to 60.
So we put the opening braces at the end of a line instead of on a new line, and now we're all good, and we end up sacrificing readability for the sake of readability. So what if we just take out braces altogether and use only whitespace to convey information about the structure of our code to both the compiler and the human? I think that, once you get used to it, it's far less ambiguous compared to using a combination of both whitespace and braces. And if we do that, we can actually stop arguing about where to put our braces and just move on with our lives. And with that, I want to quote one of my favourite lines from "The Zen of Python", that, "There should be one, and preferably only one, obvious way to do things." Which is also one of the things that I really like about Lisp, where the syntactic structure of your code is always consistent and always very simple, even if you have to get used to seeing all these brackets. And if you're curious, this is how I would write the equivalent method in F#. Even though there are no braces here, the structure of my code, even when I squint, is still very clear, very obvious. And you can also see that, because there's far less syntactic noise, you end up writing far less code as well. And having less code to do the same thing or more is good. Which brings us to my favourite section of this talk, Naming. Because we all know naming is hard. You've probably heard this quote before, that, "There are only two hard things in Computer Science: cache invalidation and naming things." And, of course, someone came along later on and added off-by-one errors. But names are utterly important as well, because they are the one and only tool that we have to explain what a variable does in every place that it shows up in our code without having to put comments everywhere.
And a common practice that I'm sure we've all observed is called Lego Naming, where you glue different words together to try to create more meaning, using common words such as service, process, factory, proxy, controller, strategy, and so on. But as you can understand, it's a fairly easy process to automate, so why do it yourself? Just head over to methodnamer.com and let the system procedurally generate a bunch of names that you can use for your methods. The bad news is that this is not naming, folks. What we're doing is labelling things, which is not the same as naming them. Adding more words together doesn't necessarily create more meaning. In fact, more often than not, it ends up diluting the meaning of the thing that we're trying to name. And that's how we end up with gems such as ControllerFactoryFactory. And a while back, I went on to GitHub. I did a search for FactoryFactory and I found an embarrassing number of hits. Some of them are just out of this world. I've got a class here called FactoryFactory that implements a type called TypedGameComponent of generic type FactoryFactory of type Factory. And TemplateFactoryInterface of generic type Factory and FactoryFactory of Factory. Where do you draw the line? The sad news, even worse than this, is that naming is not a problem that's unique to us. - Backward causality: if A does B and A is successful, then if you do B, you too will be successful. So being a scientist, I thought, can we automate this process of copying? So I attended a lot of conferences, listening to people talk about strategy, trying to identify various memes, or what I like to call Business Level Abstractions of a Healthy Strategy, or BLAHS for short. So here are the common BLAHS: digital business, big data, disruptive, innovative, collaborative, competitive advantage, ecosystem, blah, blah blah blah blah blah. I created a strategy template for this. And our strategy is blah.
We will lead a blah effort of the market through our use of blah and blah to build a blah, and so forth. Then I took all the BLAHS and the Blah Template and auto-generated 64 strategies. So let's go through them. Number one. Our strategy is customer focused. We will lead a disruptive effort of the market through our use of innovative social media and big data to build a collaborative cloud based ecosystem, et cetera et cetera. Strategy two. Our strategy is innovative digital business. We will lead a growth effort of the market through our use of customer focused competitive advantage and disruptive social media to build a collaborative revolution, and so on and so on. Number three, no, not really. So I sent these out to our local friends, put them online, and I got about 150 responses. Of three contacts, the first one was, "This is more or less the exact wording from our business plan." The second was, "I've seen two of these used already." And the third one was, "Are you for hire?" - So that's Simon Wardley doing his keynote at OSCON in 2014. So we can all agree, naming is hard, right? And Lego Naming is just such an easy way out. Fortunately, in F# and in functional languages, we quite commonly use anonymous functions, aka lambdas. And the fact that they are anonymous means that we don't have to name them. So straightaway, we can remove a lot of things that we would otherwise struggle to name to begin with. Instead, we end up with lambdas whose meaning is created by the higher-order functions that use them, higher-order functions such as map and reduce. And since the lambdas are very short in terms of scope, you can also get away with using shorter and more generic names for the values that you declare in a lambda, which is why you commonly see names such as x, y, and zed being used inside a lambda function, which of course upsets some people, who are probably not part of the functional crowd.
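To make the point about anonymous functions concrete, here is a small sketch; the list and its values are invented for illustration and do not come from the talk. Neither lambda has a name of its own, and the surrounding higher-order function tells the reader what each one is for:

```fsharp
// Illustrative data, not from the talk.
let scores = [ 3; 5; 8 ]

// List.map gives the lambda its meaning: "do this to every element".
// With a one-line scope, a short name such as x is enough.
let doubled = scores |> List.map (fun x -> x * 2)

// List.reduce gives the lambda its meaning: "combine elements pairwise".
let total = scores |> List.reduce (fun x y -> x + y)

printfn "%A %d" doubled total // prints "[6; 10; 16] 16"
```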
And interestingly, the best defence for this practice came from none other than Uncle Bob in his famous "Clean Code" book, where he says that, "The length of a name should be related to the length of the scope." So if you've got a for-loop that's five lines of code, it's okay to use variable names such as i, j, and k. And I'm sure if you go back to the university course where you first learned to program, those are the names that people used for those examples. So the same principle applies here, where we use names such as x, y, and zed for a lambda function that's maybe two lines of code. Another common practice that you see in functional languages is using tuples and pattern matching against them, rather than creating one-off abstractions with two or three fields. So again, this eliminates the need for you to name those abstractions, and you end up with far fewer things that you have to name. For example, this is five lines of F# code. It calculates a term frequency. And Seq.groupBy returns a sequence of tuples of two elements. So I can, in the pattern matching here, give the two elements very useful names. If you're coming from C#, your best bet would be a variable called, maybe, tuple, with fields named Item1 and Item2. But in F#, you have the ability to pattern match against them and give them very useful names. Lego Naming can also be a symptom of a failure to identify the right level of abstraction to work with, partly because the right level of abstraction is often smaller than an object. And in OO, we just don't have a very good way to represent a pure piece of functionality. So the common practice would be to wrap this functionality inside a class or an interface. So you end up with something that provides the functionality that you're looking for. And therefore, you have two things to name rather than one. Naming is hard, let's just go Lego Naming. So ConditionChecker, check.
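The five-line term-frequency example mentioned above was shown on a slide that is not reproduced in this transcript. The following is only a sketch of what such a function might look like, not the speaker's code; the name `termFrequency` and the sample input are assumptions. It shows `Seq.groupBy` producing two-element tuples and the lambda patterns giving each element a meaningful name:

```fsharp
// Hypothetical reconstruction; the speaker's slide code is not in the transcript.
let termFrequency (words: string seq) =
    words
    |> Seq.groupBy id // yields (key, elements-with-that-key) tuples
    |> Seq.map (fun (word, occurrences) -> (word, Seq.length occurrences))
    |> Seq.sortByDescending (fun (_, count) -> count)
    |> Seq.toList

// Sample input invented for illustration.
let frequencies =
    termFrequency [ "the"; "cat"; "the"; "hat"; "the"; "cat" ]

printfn "%A" frequencies
```

The patterns `(word, occurrences)` and `(_, count)` name the tuple elements at the point of use, so no one-off type with named fields is needed.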
But if we work at this problem hard enough, we might be able to come up with slightly better names. So Condition is a better name than ConditionChecker. And IsTrue is probably just as good a name as CheckCondition, if not better. But for me, this, while much better, is still way too much wrapping around what it is I'm looking for, which in this case is essentially that a condition is something that you can evaluate with no argument and that returns a boolean. And this is how I would represent that in F#, which I'm sure everyone can agree is far shorter and far more concise. And what's more, any function that matches my type signature can be used as a condition without having to explicitly implement my ConditionChecker interface. So that's our three out of seven ineffective coding habits that many F# programmers don't have. And through practice, we all pick up habits, good and bad. And Vince Lombardi, a famous American football coach in the 1950s, I believe, once said that, "Practice does not make perfect." And, "Only perfect practice makes perfect." And while perfection is not attainable, if we chase perfection, we might just be able to catch excellence. And going back to programming languages, the defaults in your language give you affordances towards a certain set of behaviours and habits. In functional programming, for example, or at least in functional-first languages such as F#, Scala, Elm, Clojure, et cetera, immutability being the default affords a habit of being very conscious about where and when things can change. And every language, and to maybe an even greater extent every programming paradigm, has a huge impact on how we think and how we solve problems. And that impact is most profound with the first language that we ever learn. And since every language and paradigm forces you to think in a certain way, it also restricts you to seeing only a small part of the solution space for any problem that you are faced with.
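Going back to the condition example earlier in this passage: the slide is not in the transcript, so the exact definition shown is unknown. A minimal sketch, assuming a type abbreviation named `Condition` and an invented consumer function `describe`, could look like this:

```fsharp
// A condition is anything that takes no argument and returns a boolean.
// (Assumed definition; the speaker's slide is not in the transcript.)
type Condition = unit -> bool

// Hypothetical consumer, invented for this illustration.
let describe (condition: Condition) =
    if condition () then "holds" else "does not hold"

// Any function with a matching signature can be used as a Condition;
// there is no interface to implement.
let alwaysTrue () = true
let listIsEmpty () = List.isEmpty [ 1; 2; 3 ]

let first = describe alwaysTrue             // "holds"
let second = describe listIsEmpty           // "does not hold"
let third = describe (fun () -> 2 + 2 = 4)  // "holds"
```

A named function and an anonymous one are accepted alike, which is the point made above about not needing a ConditionChecker interface.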
And from personal experience, having spent quite a lot of time with a number of different languages and paradigms, I have to admit, it does take a lot of effort and hard work to unlearn the ways that we've been taught to see computing and to be able to expand our minds. But I think, overall, it's certainly effort well spent, because it frees you to look at the same problem from many different angles, in many different lights, and come up with solutions that maybe you wouldn't have been able to find had you only known how to solve problems with OO or with imperative programming. So with that, I want to leave you with a few more lines from "The Zen of Python". That, "Explicit is better than implicit." "Simple is better than complex and complex is better than complicated." And whilst special cases aren't special enough to break the rules, always remember that practicality beats purity. And if the implementation is hard to explain, then it's probably a bad idea to begin with. And with that, my name is Yan and I work for a games company called Space Ape Games. You can find me at these places and you can find the slides here, and thank you very much for your time.