Thinking Functionally 2

John Stovin speaking at dotnetsheff in September, 2017

About this talk

To think and program functionally, you need to reassess many of the thought processes and habits that you have developed for coding in curly bracket languages, and understand a new set of idioms and ways to think about code. In this session, John will talk about some of those fundamental idioms, explain why they exist and how they interact to provide a very different way to develop software on the .NET stack.


Transcript


So I've been playing around with F# for a while and learning about functional programming, and there are a lot of things that apply to that, that are equally useful and applicable in other languages too, so this talk is really more about how you can take those things and apply them, particularly in C#. So yeah, I like shiny things. There are lots of shiny things in Sheffield. That slide was in there because a lot of people I gave this talk to before didn't know what Sheffield was like.
So yeah, I've been learning F#. I find I learn by teaching stuff. The best way of learning things, I find, and making them stick in my head, is by teaching other people, and it also gets me to understand what I know and what I don't know. So at the moment, I'm kind of learning F#. I'm in no way an expert. There are lots of bits of the language I don't know. There's certainly lots of stuff about the theory of functional languages and functional programming that I would like to know more about and I'm gradually catching up with. There's a huge body of knowledge there. I mean, in my kind of development career, I tend to lurch from impostor syndrome to Dunning-Kruger and back again, and this is definitely the kind of impostor syndrome talk. This is a few things I've learned, this is not me as an expert by any means. So if you know better than me, shout out, I would be grateful. If you don't, then fair enough.
I started out as a C and C++ programmer way, way back in the day. You can see I'm not as young as I used to be. And when I moved from C to C++, it took a while to get the hang of this object-oriented stuff. It was all fairly new back in those days, in the '80s. And I found also, learning functional programming, that again it takes a big paradigm shift. You have to change the way you think about things in a lot of ways in order to write really good functional code.
It's very tempting in F# just to fall back on the stuff you know from C#. You can do it all in F#. But what I've been trying to do is learn how to do it properly and I'm trying to sort of pass that experience on. So this is a quote from way, way, way back when Fortran was the thing. "The determined real programmer can write Fortran programmes in any language." And I'm sure we've all come across programmers who still write BASIC in any language they can. So yeah, I realised I needed to understand these new approaches to writing code, and I found that I can bring that stuff back to my day job in C#. I wonder if anybody has a job writing F# in Sheffield. Let me know, thank you. Not come across one yet, but it'd be nice.
So a bit of history. Functional programming, functional languages have been around a long time. They have a strong theoretical background. Alonzo Church invented the lambda calculus back in the 1930s. He was Alan Turing's PhD supervisor. So he's an important guy. And Turing also showed that lambda calculus and Turing machines are mathematically equivalent. John McCarthy invented Lisp in the '50s, which was arguably the first functional language. Then in 1977, John Backus wrote a very important paper, "Can Programming Be Liberated from the von Neumann Style?" He argued that writing code using loops and iterators and things like that was too close to the machine and we needed more abstraction. We're still searching for that abstraction, I think. So early serious functional languages were probably Miranda. I did a bit of Miranda when I was at university, and then Haskell came a bit later. And now we have a lot of functional languages. Scala, F#, Clojure. A lot of these things run on the virtual machines of other languages. For example, Clojure runs on the Java VM. F# runs on the .NET VM. Just one thing I was gonna say.
I think there's a lot of interest in functional languages now because the thing they give you is an extra level of abstraction. And back in the '70s, when people were first starting to think of this stuff, the hardware wasn't really able to deal with that level of abstraction and run efficiently. We've now got powerful enough processors that actually we can work at that level and have the compiler and the language infrastructure worry about a lot of the fiddly detail for you without having a big performance hit. I think that's why things are moving in the functional area at the moment.
So there's a few things I'm gonna talk about. I'm gonna start with functions, immutable data, data types, recursion, lists and sequences, higher-order functions and so on. I'm sure you can all read the list. All these topics kind of build upon each other. When I was writing these slides for the first time, it helped me realise the links between these things, and I'm hoping I can kind of give you some of that "aha, that's how that all fits together" kind of thing, because it's not always clear if you just read articles on this stuff as to why this stuff is important and how all these things fit together.
So the first thing about functional languages is we try to write pure functions. So a pure function is a function that has no side effects and has a valid output for every possible input. So when we say no side effects, exceptions count as side effects. So an example of a pure function would be the sign function. Every possible value that you can pass into that function will give you a valid result. And keep your functions small. We all know that. I mean surely everybody knows that these days. It's much easier to understand small functions and compose small functions than it is to write big functions. They're easy to test, they're easy to understand.
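To make the definition concrete, here is a minimal F# sketch that was not part of the talk. The names `signOf`, `firstOrThrow`, `tryFirst` and `signOfFirst` are invented for illustration. It contrasts a pure function, a function that can throw, and a total alternative that can be composed with other small functions.

```fsharp
// Pure: the result depends only on the argument, and every int has an answer.
let signOf (n: int) : int =
    if n > 0 then 1
    elif n < 0 then -1
    else 0

// Not pure in the talk's sense: List.head throws on an empty list,
// so some inputs have no valid output.
let firstOrThrow (xs: int list) : int =
    List.head xs

// A total alternative: the "no answer" case is part of the return type.
let tryFirst (xs: int list) : int option =
    List.tryHead xs

// Small functions plug together with the composition operator >>.
let signOfFirst : int list -> int option =
    tryFirst >> Option.map signOf

printfn "%d" (signOf 42)
printfn "%A" (signOfFirst [ -7; 3 ])
```

Because `signOfFirst` is built only from total functions, there is no exception path to test, which is the saving in test effort the talk describes.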
And I find also when I'm writing functional code, because I'm trying to write pure functions, you reduce the amount of testing you need to do, because if you've got a function and you know that every valid input has a valid output, you don't need to test for all the side effects, for all the weird "oh dear, that function's just thrown an exception" kind of stuff. So try and keep your functions pure. There are ways to do it, which I'll cover later. So use small functions and compose functions to build more complex behaviour. So plug them together. That's how it works. And in a good functional programme, everything is a function.
The next kind of cardinal rule of functional languages is that data should be immutable. So once you've created a thing, you cannot change it. You can turn it into something else. I'm trying to think of a way of explaining this. You can mutate an object, but what you do is you return a new version of that object. So it's a copy with a different property in some way. So you don't reach inside your objects and go "I want that string to no longer be missus but mister", for example. You create a copy of that and you return it with the value changed. There are lots of good reasons for doing this. So I should say that as far as an external observer is concerned, let's say if you're calling this function, if you're looking from the outside, you pass in a value and you get a new value out that is different. What goes on inside is a different matter. That's what's meant by immutability.
So in most functional languages, and in fact all the significant functional languages, you have to stop thinking about variables, and we call them single assignments instead. So in F#, there's the let statement, which assigns a value to an identifier. So I can say let X equals one. But if I then do let X equals two, that's a compile-time error because I've assigned a new value to an immutable value.
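A short F# sketch of single assignment and copy-and-modify. This is an illustration rather than the speaker's slide, and the `Person` record with its field names is an assumption.

```fsharp
// A record type; its fields are immutable unless declared otherwise.
type Person = { Title: string; Name: string }

let x = 1
// x <- 2   // does not compile: "This value is not mutable"

let original = { Title = "Mrs"; Name = "Smith" }

// Copy-and-update: builds a new record and leaves `original` untouched.
let updated = { original with Title = "Mr" }

printfn "%s %s" original.Title original.Name   // Mrs Smith
printfn "%s %s" updated.Title updated.Name     // Mr Smith
```

One detail worth knowing: a second `let x = 2` at the top level of a module is rejected as a duplicate definition, whereas inside a function body it is accepted and shadows the first binding. In neither case is the original value changed.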
F# does have mutable, I'll come to that in a moment. So changing things, copy and modify, is an atomic operation, it happens in one go. Again, it improves readability. I find it makes it easier to understand code because you're not worried about some other thing changing your data underneath you. So yeah, that's immutable data. Sorry, got slightly sidetracked.
So one of the great advantages of having immutable data is that it is thread safe by its very nature. If you can only assign a value to something once and never change it, there's no risk that some other process on some other thread is gonna come along and change it, and then you don't get all the issues that you get with having locks and mutexes and all those sorts of things. So it makes thread-based, multiprocessor programming on modern architectures a lot safer, a lot easier. You don't have to worry about all that tedious stuff.
So F# does have a mutable keyword. It's horrible and ugly because you don't wanna use it. This is assigning a value. This is creating a true variable. It's saying X is mutable and then I can come along and assign four to that value later on. But that's a big code smell. Really, if you're writing a lot of F# that looks like that, you're doing it wrong.
So how can we apply principles of immutability to our C# code? Well, here's the sort of code that you might see quite often. Here's a simple two-dimensional point. It's about the simplest thing I could think of to illustrate the example, and it has one operation on it. There's a translate operation that lets me move it from one place to another by adding an offset to that point. Does everybody understand what this is doing? Put your hand up if you don't. Okay. So there are lots of reasons why this is nasty. I mean, let's go back and look at it. So if you translate it, you're changing the value as you do it.
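The slide itself is not reproduced in the transcript, and it was C#. As a rough stand-in, here is a sketch in F# of the two shapes such a point class can take: the mutable one being criticised here, and the immutable rewrite the talk turns to next. The type and member names are assumptions.

```fsharp
// Mutable version: Translate changes the object in place, in two separate steps.
type MutablePoint(x: float, y: float) =
    member val X = x with get, set
    member val Y = y with get, set
    member this.Translate(dx: float, dy: float) =
        this.X <- this.X + dx   // another thread could read the point here,
        this.Y <- this.Y + dy   // with X already moved and Y not yet moved

// Immutable version: values are fixed at construction and can only be read.
// Translate returns a new point instead of modifying this one.
type Point(x: float, y: float) =
    member this.X = x
    member this.Y = y
    member this.Translate(dx: float, dy: float) = Point(x + dx, y + dy)

let p = Point(1.0, 2.0)
let q = p.Translate(3.0, 4.0)
printfn "p = (%g, %g), q = (%g, %g)" p.X p.Y q.X q.Y
```

In idiomatic F# the immutable version would more likely be a record, but a class keeps the sketch close to the C# being described.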
So there's a risk that you might actually get a context switch in the middle of this, and then something else might look at this and find that half of this value has changed and half of it hasn't. So there's all sorts of... ooh, that's good. This is getting more and more psychedelic as we go on. And also, we've got getters and setters on this. So something could assign a value to X while we're actually translating that value if we're on a multithreaded system. There's all sorts of nasty, nasty risks with that.
But this is how I was kind of taught to do it when I started doing C++: keep everything immutable. So we have a constructor, which assigns the values, but we can't set the values through the properties of the class. We can only get them out again. Reading is always thread safe, so that's fine. And our translate function, instead of changing the values in our class itself, returns a new version with the new values, so that anything that's holding a handle to this point object doesn't suddenly find that its values get changed underneath it. So this is the kind of approach I always like to see when I do code reviews. And I always find, in a lot of the stuff I see, especially using DTOs, everyone seems to put getters and setters on everything in DTOs and it just scares me. But anyway, that's a kind of side point. Oh and by the way, in C# 6, you don't actually need the setters.
So another thing functional languages generally don't have is null. Tony Hoare, another great computer scientist of history, who was one of the fathers of ALGOL, said it was his billion-dollar mistake. Because you have nulls in databases, they ought to have a null in their language to match. I think it's cost rather more than a billion dollars by now, I suspect it's a multibillion-dollar mistake. We've had so many other languages since ALGOL and all the curly bracket languages allow nulls.
You would've thought that by the time we got to Java and C#, people would've realised that that wasn't a good idea. There's not much you can do in C# to avoid nulls. It gets really horrible when you have to check for null everywhere. It's almost as bad as async, where you have to put async everywhere once you make something async. Everything has to be async. If you make something nullable, everything has to be nullable and you have to check for it everywhere. There's a few rules of thumb. Don't use null as a default value for things. Look at things like the Null Object pattern so that at least you return a value which has a meaning and which won't cause exceptions to be thrown just because you forgot to check for it. Possibly use code contracts or put null attributes on function parameters so at least you can get the compiler to check for this stuff rather than leaving it till runtime. I mean, that's one of the really bad things about it, that it'll go wrong at runtime and the compiler doesn't spot it.
I'm not gonna go into detail about option types. That's kind of a bit later, but the idea with an option type is that you can either return some value or a non-value, which, certainly in most functional languages, means that you can pattern match on it more easily. We have a bit of pattern matching now in C#, but it's not as good as what you've got in F#. Again, more on that later.
Next. Types. The third kind of leg of the functional stool is type algebra, is understanding the types of everything you're using and using types to convey meaning. It's very tempting, particularly when you're writing things that kind of talk to databases and just throw text on the screen or something, to make everything what's known as stringly typed, just pass everything around as strings. The risk of that is really, really... You end up in a mess if you do that.
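The two ideas just mentioned, returning an option instead of null and wrapping a bare string in a named type, can be shown together in a small F# sketch. It is illustrative only: `EmailAddress`, `tryCreateEmail` and `describe` are made-up names, and the "contains @" check is a deliberately naive placeholder rather than real validation.

```fsharp
// A single-case union gives a bare string a meaningful name.
type EmailAddress = EmailAddress of string

// Returns Some for acceptable input and None otherwise,
// instead of returning null or throwing.
let tryCreateEmail (raw: string) : EmailAddress option =
    if not (isNull raw) && raw.Contains("@") then Some (EmailAddress raw)
    else None

// Pattern matching handles both cases explicitly.
let describe (input: string) : string =
    match tryCreateEmail input with
    | Some (EmailAddress address) -> sprintf "valid: %s" address
    | None -> "not an email address"

printfn "%s" (describe "john@example.com")   // valid: john@example.com
printfn "%s" (describe "not-an-email")       // not an email address
```

A function that takes an `EmailAddress` can no longer be handed an arbitrary string by accident, which is the point of avoiding stringly typed code.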
The more that you think of using types to represent your data structures, the clearer it makes your code and the easier it becomes to read. It makes the intent clearer. So if in doubt, create a class or a struct, give it a name, put what you need in it, transform types into different types, because again, it helps your logic and it helps other people who come along later to understand what your code is doing. F# has a lot of type support that's just not in C#. Although C# has tuples, the syntax is horrible. Without the pattern matching you get in F#, using what you might call anonymous types like tuples and records is really quite hard. But you can do it. But the best thing to do is at least wrap your fundamental types up into more complex types with meaningful names. Okay yeah. As I say, too much C# code uses naked strings and ints. It's all stringly typed. I've just said all that so... And again, coming back to what I said earlier, if you're using a null type or a null object, you can still have a real object that won't throw exceptions but conveys some meaning that this value is invalid in some circumstances.
I'm gonna show you a bit of F# code here just to show you what you're missing. So this is a discriminated union. You might look at this and say this looks a bit like inheritance, but it's much more forgiving than inheritance, because it says my shape is either a rectangle, and a rectangle has two arguments, a float and a float, let's call them the height and the width, for example. Or my shape's a circle, in which case it has a radius. Okay? So this isn't inheritance, because the rectangle and the circle are quite clearly unrelated to each other. They don't really have any properties in common apart from the fact that we've decided to say that they're both shapes.
So this means that I can, for example, create a strongly typed list of shapes and then iterate across it and work out the area of each shape in that list by applying this function to it. So this says find me the area of my shape called S. I should say F# does type inference. So because I know here S can be either a rectangle or a circle, it can infer from this statement which came before that S must be a shape. So I don't have to declare that it's a shape. And this says, to find me the area, if S is a rectangle and therefore it has two arguments, X and Y, then multiply those two together and return them and that's my area. Or if it's a circle, take the radius, square it and multiply it by pi and give me the area that way. So I've missed the last line out. So I've declared a rectangle here and a circle here and I can apply area to rect and I get its area. And I can apply area to circ.
So there's no way really, I mean you could kind of fake this in C#. You could probably have an empty interface or something called IShape, which would at least let you construct lists of these things. But it would feel a bit messy and a bit... You can see how you can write very clear but concise code in functional languages because you have these extra facilities.
So again, to go back to our discriminated union, we can do pattern matching. We can say if a particular object is of a particular type, then we do something with it. If it's some other type, we can do something else with it. We've now got this in C# 7, I think. Yes, we have. You can now do a switch statement based on the type of the object passed in. It's still not as advanced as the functional version because you can't pull individual bits out of your object at the same time as you match it. I'm not gonna go into detail how that works. And again, this shape is an additive type because it's this type or it's this type. So it's that type or the other type.
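The slide being walked through is not in the transcript. A reconstruction along the lines described might look like this in F#. The names `Shape`, `area`, `rect` and `circ` follow the talk, while the concrete numbers are invented.

```fsharp
// A discriminated union: a Shape is either a Rectangle or a Circle.
type Shape =
    | Rectangle of float * float   // width and height
    | Circle of float              // radius

// No type annotation on s: the patterns tell the compiler it must be a Shape.
let area s =
    match s with
    | Rectangle (x, y) -> x * y                  // pulls x and y out while matching
    | Circle r -> System.Math.PI * r * r

let rect = Rectangle (2.0, 3.0)
let circ = Circle 1.0

// A strongly typed list of shapes, with the area worked out for each one.
let shapes = [ rect; circ ]
let areas = List.map area shapes

printfn "%A" areas
```

If a third case is later added to `Shape` and `area` is not updated, the compiler warns that the match is incomplete, which an empty marker interface in C# would not give you.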
Rectangle is what we call... I forget what the other type of type is, I've forgotten my terminology. So we can either say it's a type and a type together. Yeah, multiplicative type, thank you. Or we can have additive types, which is one or the other. Again, we don't have any kind of additive types in C# really. You can fake them. You've got inheritance, but it isn't really additive types. It's not quite there. So it's all a bit messy and I find if you're doing stuff with complicated type structures, it just gets messy in C# and you have to write a lot of boilerplate. I haven't really had a chance to use it much, but C# 7 does now have the extended is operator and switch statements, which means that you can match on types of things and you can declare a variable as you match to cast things to a more derived type, so that if you have a list of base types, you can cast them to the particular type that you're looking for. But there's still a lot that's missing.
So we all write loops all the time in ordinary curly bracket languages, but the problem from a functional point of view is, we've already said earlier, one of our fundamental tenets is we don't have mutable values. We only have single assignment. So you can't write a loop unless you have a mutable variable, because you have to assign the loop counter each time you go around the loop. So the only way you can break that is to use recursion. Functional programming geeks would say recursion is often simpler and more elegant. I think, as I say, beauty is in the eye of the beholder. Once I started writing recursive functions, I realised that it probably is true, but it can take a while to get your head around writing recursive functions.
One of the nice things about most functional languages is that, in most cases, the recursion is done for you, because you can use what are called higher-order functions, which are functions that take another function as an argument, so that the higher-order functions will do the recursion for you and return you some sort of list or sequence of stuff using the function that you passed in.
So this is the recursive version of it. So what we're trying to do, as I say, is to get rid of this because this is a variable and we don't want variables. And the way we do that is that we use the stack. So again, we get the enumerator of our sequence and then we call our separate function and we pass in our initial count of zero. And then each time we try to move to the next item, if we fail, we return the value that we've already found. If not, we call the function again and we return. So this isn't an assignment. This is creating a new value on the stack, which is this value plus one. And then when we get to the end, the stack will unwind and we've passed this value all the way back down the stack.
In F#, if I write a function like that, I can actually write it a bit more cleanly. And the compiler will actually take that and go "ooh, that's tail recursion". The last thing I do is increment the value on the stack, so it's safe. It will turn it back into a loop and you might say, well, what the hell's the point in doing that? And the answer is that it only does it in the case of tail recursion, which is provably safe. I think it will warn you if it can't do it. And also, whoops, I'm going the wrong way. The point is that the idea of single assignment is not really a machine-level concept, it's a human-level concept. The idea is to make sure that you only do assignments to values at points at which it's safe to do it, at points where it's an atomic operation.
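As a rough F# rendering of the counting example just described (the slides themselves are not in the transcript, so the function names here are guesses), the loop with its mutable counter and the recursive version that threads the count through as an argument could be written like this:

```fsharp
// Loop version: needs a mutable counter that is reassigned on every pass.
let countWithLoop (items: seq<'a>) : int =
    let mutable count = 0
    for _item in items do
        count <- count + 1
    count

// Recursive version: get the enumerator, then pass the running count along.
let countRecursive (items: seq<'a>) : int =
    use e = items.GetEnumerator()
    let rec loop acc =
        if e.MoveNext() then
            loop (acc + 1)   // tail call: a new value, acc + 1, not an assignment
        else
            acc              // no more items: return the count found so far
    loop 0

printfn "%d" (countWithLoop [ "a"; "b"; "c" ])    // 3
printfn "%d" (countRecursive [ "a"; "b"; "c" ])   // 3
```

The recursive call is the last thing `loop` does, which is the tail-recursive shape that can be turned back into a loop behind the scenes.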
And the compiler can work that out a lot more easily than you can. So yes, C# won't unwind recursion into loops. So if you are gonna write recursive functions in C#, make sure that you are absolutely confident that your function will terminate, because otherwise you're gonna get runtime exceptions, stack overflows.
So we've talked about sequences, and another big idea that's come from functional languages, and particularly all the way back to LISP, is the idea that we can think about things in terms of lists and sequences. And lists are, in themselves, recursive structures, because you've either got an empty list, a list of nothing, or you've got an element and a list. So you've got an element, which is the head; and the rest of it, which is a list, and the list is either empty or an element and a list and so on ad infinitum. By the way, does anybody know what the B in Benoit B. Mandelbrot stands for, what his middle name is? Apparently, it's Benoit B. Mandelbrot, but there you go. So you can recursively create lists by sticking items on the front of an existing list. And again, you can deconstruct lists by taking things off the front recursively. So we've already said we want recursion because we don't want multiple assignments. So we've now got a data structure that allows us to use recursion in a safe way, so that's why functional languages like lists and that's why LISP stands for list processing, because it was all lists pretty much.
So we also have the idea in functional languages that functions are first-class objects in their own right. So I can pass a function to another function. And what's more, I can pass around a function that hasn't been given its arguments. So, a simple example, I could create an add function that takes two arguments, add X, Y.
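In F# that `add` function, and a version of it that has been given only its first argument, might look like the sketch below. It is illustrative rather than taken from the slides, and the name `addTwo` is an assumption.

```fsharp
// add takes two arguments.
let add x y = x + y

// Supplying only the first argument yields a new function that waits for the second.
let addTwo = add 2

printfn "%d" (addTwo 3)   // 5
printfn "%d" (addTwo 2)   // 4

// The function is itself a value, so it can be handed to another function.
printfn "%A" (List.map addTwo [ 1; 2; 3 ])   // [3; 4; 5]
```

Giving a function some of its arguments and keeping the result is usually called partial application.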
I could create a closure (if you've done any LINQ, you'll have some idea of what a closure is) and create another function to which I've just passed the first argument, so I've got a function now that says add two to something. And then I can hold onto that thing as an object in its own right and then pass different values into it, so I can call that with the argument three and get five, and I can call it with the argument two and get four, because I've built a closure around my original function. So by being able to pass around functions and closures on functions to other functions, we can build up an awful lot of complexity. Who here hasn't used LINQ? No one, I hope. So if you've used LINQ, you've created a lambda function, which is just a function with no name. It's an anonymous function. You can also create closures in LINQ as well. So that's another useful thing: think in terms of functions. And also, on a side note, if you've used LINQ and you ever do something that requires dealing with streams of inputs, so stuff being pushed to you rather than pulled by you, have a look at Reactive Extensions, if you haven't already. They're wonderful. But that's all I'm gonna say about that.
So to get back to where we were before, collections. We have two basic collection types in F# and in most functional languages: lists and sequences. Lists are implemented as singly linked lists. So you can always pull the thing off the front and you've still got a list at the end. Sequences behave very similarly, but the underlying implementation is instead an IEnumerable. There are times when you want one, there are times when you want the other. Generally, you use a list unless you need a seq. The nice thing about these types is that we have a lot of built-in function libraries that we can apply to them because we can use lambdas as an argument.
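For a flavour of those built-in functions, here is a small F# sketch, not from the slides, using the standard `List` and `Seq` modules with sample data invented for illustration. The comments give the nearest LINQ name for each.

```fsharp
let numbers = [ 1; 2; 3; 4; 5; 6 ]

// filter keeps the items for which the predicate returns true (LINQ: Where).
let evens = List.filter (fun n -> n % 2 = 0) numbers          // [2; 4; 6]

// map transforms every item, possibly into another type (LINQ: Select).
let labels = List.map (fun n -> sprintf "#%d" n) numbers

// fold collapses the whole list to a single value (LINQ: Aggregate).
let total = List.fold (fun acc n -> acc + n) 0 numbers        // 21

// unfold goes the other way: grow a list from a seed, returning None to stop.
let powersOfTwo =
    List.unfold (fun n -> if n > 100 then None else Some (n, n * 2)) 1

// Sequences are lazy, so an infinite one is fine as long as only part is taken.
let firstSquares =
    Seq.initInfinite (fun i -> i * i) |> Seq.take 5 |> Seq.toList

printfn "%A" evens
printfn "%d" total
printfn "%A" powersOfTwo
```

The laziness shown in `firstSquares` is one of the reasons to reach for a seq rather than a list.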
So the simple cases are some sort of filter. These are F# and generally functional names. The names tend to vary in different languages. But we have a filter function, so we can pass a predicate, which is a function that takes an argument of the type of the element of our list and returns bool. And if we pass that and our list to this function, the function will return a list that contains only the items for which that function returns true. In LINQ, that's Where. Map takes a function that takes one argument, which is of the type of whatever our list contains, and returns a value of some other arbitrary type, whatever type you like. And it will iterate across your list and transform each item in that list into the type that you require. So again, in LINQ, that's Select. So again, if you use LINQ or collection operations, this is almost certainly more efficient than stuff that you're gonna write in a loop. So if you're writing C#, use LINQ. If you're writing F#, use list and sequence operations.
There are lots and lots more, and there's some really clever complicated things you can do. There is the fold operation that allows you to take a list and iterate over it and collapse it to a single value. There's an unfold operation that goes the other way, so you can take a rule and generate a list from it with unfold, which is really nice. I use it a lot. So yeah, in C#, use LINQ. Use it everywhere. If you find that you're writing for or foreach anywhere, slap yourself on the wrist and say, can I do this in LINQ? Oops sorry. So LINQ's functional. It's composable. So you can build it up from simple rules. It'll work on all sequences, so yeah, use LINQ. Yeah, as I said earlier, use Reactive Extensions because they give you composability over incoming events in the same way that LINQ gives you composability over IEnumerable.
It's great for UI code, it's great for handling anything where you've got streams of data coming at you and you need to deal with it in a simple and clear and understandable fashion.
The last thing I'm gonna talk about is a technique that you don't often see used in curly bracket languages. I think it's a really useful technique, especially when you are performing a sequence of operations on a set of data, any of which could fail, and where what you want to do is continue performing each of those operations in turn unless one of them fails, in which case you want to bail out at the earliest possible opportunity. So again, this is some F#. The basic idea is that any function that we want to use in this approach will return this result type. This apostrophe says that TSuccess and TFailure are generic types. So this is a generic. You can think of this in some ways as a generic class which can take any two other types as its arguments. So this will either have a value of success, in which case it will contain a success object, whatever you want to convey that meaning of success, usually the result of your computation. Or some sort of failure type, which again is a generic type, but you usually put the error message or an exception or something like that in there that explains in more detail what your failure is.
So you can think of your function as a set of railway points. For those of you who ever had a Hornby set when you were a kid, that should be fairly clear. You pass your argument here, and your function either returns success, and the success contains the successful result, or it fails and it returns a failure result. And then, by using a bit of functional pluggery, you can join them together. So at any point, you pass an argument in here. If it fails, you follow the failure route all the way along.
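A minimal F# sketch of the result type and the joining-up being described. The slide is not in the transcript, so this is a reconstruction under assumptions: `bind` is the name commonly given to the joining function, and the three validation steps are made up for illustration.

```fsharp
// Either a success carrying a value, or a failure carrying an explanation.
type Result<'TSuccess, 'TFailure> =
    | Success of 'TSuccess
    | Failure of 'TFailure

// The "points": on Success run the next step, on Failure skip it
// and stay on the failure track.
let bind next result =
    match result with
    | Success value -> next value
    | Failure error -> Failure error

// Three hypothetical steps, any of which can fail.
let notEmpty (s: string) =
    if s = "" then Failure "input is empty" else Success s

let parseInt (s: string) =
    match System.Int32.TryParse(s) with
    | true, n -> Success n
    | false, _ -> Failure (sprintf "'%s' is not a number" s)

let positive (n: int) =
    if n > 0 then Success n else Failure "number must be positive"

// Joined together: the first failure short-circuits the rest.
let validate input =
    notEmpty input |> bind parseInt |> bind positive

printfn "%A" (validate "42")
printfn "%A" (validate "abc")
printfn "%A" (validate "-1")
```

Recent versions of F# ship a built-in `Result` type with `Ok` and `Error` cases and a `Result.bind` function, so in practice the first two definitions would not need to be written by hand.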
If it succeeds, you unwrap the success value, you pass that value into the next function and so on down the line. You basically pass your initial argument in at this end, and at this end you either find that you've gone all the way through and you've got a success value with your final expected value at the end. Or something went wrong, you don't really care what, but you've got some sort of information that tells you what went wrong and that you can use to work out what to do next. There's a really nice, I forget the name of it, but somebody has written a C# library for doing this, I think it's called Operation. So there's a generic type in C#. Rather than result, they've called it Operation. There's everything in there to let you do railway-oriented programming with async methods and pretty much everything you should possibly need. Go check it out, I think it's really nice. So if you're ever doing anything in a pipeline, it's really, really nice. You don't end up having lots of if-then statements in your code. You can just join all your methods together and pull the result out at the end.
So there's lots of stuff I haven't talked about, but I've just tried to give you a flavour of some of the ideas I have been thinking of. Well, I mentioned in passing partial application. All these other operations on lists, sequences and other types. I haven't even begun to mention monads, although if you do dig into it, you'll find that we have talked about monads, I just haven't called them that.
So just a few takeaways really that I'd like you to think about at the end. It's hard to think about code when you've got a multi-core, multi-threaded environment. By giving you extra levels of abstraction and a few rules that you have to stick to, you get a lot of benefit from functional approaches. You can take those abstractions and apply them back into your non-functional languages.
You can use this stuff to write cleaner code, whatever language you use. And I think the determined programmer should write functional code in any language. I think that's perfectly legitimate, just don't write Fortran. If you're interested in any of this stuff, the F# community is really, really friendly. There's a great Slack channel, with a beginners' channel if you want to get started. This is a brilliant website for getting started on. The F# Foundation, go and look at them, they're really good. There are lots of people who really want to help you learn and understand all this stuff. This is a really good introduction. It's a bit Java-centric but it's still a good introduction to the stuff involved. So yeah, that's all I have to say, but go and explore this stuff, it will improve your coding. Thank you.