3.26.2008

The *Real* Erlang "Hello, World!"

This *is not* it:


-module(hello).
-export([start/0]).

start() ->
    io:format("Hello, World!").

I propose that the purpose of a "Hello, World!" program is to communicate something essential about the programming language in a small space. The program above does not achieve this - relative to implementations in other languages - predominantly because it omits anything to do with the Actor model, which is a core part of what makes Erlang interesting.


I propose that the following should be considered The Real Erlang "Hello, World!":


-module(hello).
-export([start/0]).

start() ->
    spawn(fun() -> loop() end).

loop() ->
    receive
        hello ->
            io:format("Hello, World!~n"),
            loop();

        goodbye ->
            ok
    end.

Let's dissect this example to see why. To run this program, install Erlang, fire up the Erlang REPL erl.exe and follow along.

First we compile and load the program with the command c(). Note that we omit the ".erl" file extension when referring to the module. Also note that I started erl.exe in the directory containing hello.erl, so I was not required to type in the full path.


1> c(hello).
{ok,hello}

Erl responds with ok and the name of the compiled module.

The start() function is the only function in the hello module that we can invoke from outside, because it is the only one that is exported, as per the module's -export attribute. This is how Erlang implements encapsulation: the exported functions form the public interface of the module. Each entry in the export list has the form name/arity, where name is the name of the function and arity is a formal way of saying "the number of arguments it takes".
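To make the export rule concrete, here is a minimal sketch. The module name export_demo and both function names are made up for illustration; they are not part of the hello module above.

```erlang
-module(export_demo).
-export([greet/0]).   %% only greet/0 is part of the public interface

%% Exported: callable from outside as export_demo:greet().
greet() ->
    io:format("~s~n", [message()]).

%% Not exported: only code inside this module can call it.
message() ->
    "Hello, World!".
```

After c(export_demo), calling export_demo:greet() prints the greeting and returns ok, whereas calling export_demo:message() from the shell fails with an undef error, because message/0 is not in the export list.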

Invoke the start() function within the hello module, assigning the return value to a variable called Pid:


2> Pid = hello:start().
<0.36.0>

The spawn function returns a Pid - a Process Identifier - which is a first-class Erlang data type. We assign this return value to a variable of the same name. (We could just as easily have assigned it to a variable named Foo, but using Pid is fairly common.) Note that variables in Erlang need to start with an uppercase letter (or an underscore).

Erl responds by pretty printing the process identifier <0.36.0>; all valid expressions in Erlang have a return value.

At this juncture, if you try to assign any other value to Pid, you will get a badmatch exception. Once a value has been bound to an identifier, it cannot change: Erlang is a single-assignment language. The benefits of this paradigm include the ability for the compiler and runtime to make fancy optimizations, and it also greatly eases debugging because variables are immutable.
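Here is a small sketch of single assignment in action. The module rebind_demo and its function rebind/1 are hypothetical names of my own; the point is only that "=" is a pattern match, so matching an already-bound variable against a different value raises badmatch.

```erlang
-module(rebind_demo).
-export([rebind/1]).

%% Binds Pid once, then tries to match it against NewValue.
%% Matching against the same value succeeds; anything else
%% raises a badmatch error, which we catch and return.
rebind(NewValue) ->
    Pid = first,
    try Pid = NewValue of
        _ -> same_value
    catch
        error:{badmatch, Rejected} -> {badmatch, Rejected}
    end.
```

Here rebind_demo:rebind(first) returns same_value, while rebind_demo:rebind(second) returns {badmatch,second}.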


The Sharp End


The spawn invocation starts an Erlang process which wraps the loop() function just below it. (Note that Erlang doesn't impose any order of definition on functions). Erlang processes are the essence of programming in Erlang, and the essential missing element in simpler "Hello, World!" examples. Processes are the Erlang implementation of the Actor model: extremely lightweight concurrency primitives that communicate purely by message-passing. They have nothing whatsoever to do with operating system processes, threads or similar, and are managed entirely by the Erlang runtime.
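To get a feel for how lightweight these processes are, the following sketch spawns a batch of them and waits for each to report back. The module name spawn_many and the {done, I} message shape are assumptions for illustration only.

```erlang
-module(spawn_many).
-export([run/1]).

%% Spawns N processes; each sends one message to the caller and exits.
%% Returns the number of replies collected, which should equal N.
run(N) ->
    Parent = self(),
    lists:foreach(
        fun(I) -> spawn(fun() -> Parent ! {done, I} end) end,
        lists:seq(1, N)),
    collect(N, 0).

%% Waits for the remaining replies, counting them as they arrive.
collect(0, Count) ->
    Count;
collect(Remaining, Count) ->
    receive
        {done, _} -> collect(Remaining - 1, Count + 1)
    end.
```

Calling spawn_many:run(10000) starts ten thousand processes and returns 10000 once they have all replied.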

The process waits (semantically at the receive statement) for a message which matches one of its receive clauses.

We can send a message to the process using an exclamation mark (the message-send operator) followed by the message. We can see that the receive block has two clauses, which match hello and goodbye respectively.

We invoke the code within the 'hello' clause by sending the corresponding message to our cached Pid:


3> Pid ! hello.
Hello, World!
hello

As we expect, our process responds with "Hello, World!". And as noted before, every valid Erlang expression returns a value; the send operator returns the message that was sent, which is why we see hello printed immediately after the output of io:format.

The next line of the hello clause makes a tail-recursive call back to loop(). In case you aren't completely familiar with tail recursion, you should know that tail recursion is the bombay duck of computer science: there is no recursion going on, at least in the sense that nothing is left on the stack. Tail recursion is a means of efficiently calling the current function, and is more akin to a goto or a jump instruction than the terminology would have you believe.

So, given the tail-recursive call back to loop(), the process is once again put back into the wait state. We could send the hello message to Pid ad nauseam and the process would simply repeat.
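For contrast, here is a sketch of the same computation written both ways. The module tail_demo and its functions are invented for this illustration. In sum_body/1 the addition still has to happen after the recursive call returns, so each call leaves work on the stack; in sum_tail/2 the recursive call is the very last thing the function does, so nothing is left behind.

```erlang
-module(tail_demo).
-export([sum_body/1, sum_tail/1]).

%% Body-recursive: "N + ..." is pending while the inner call runs,
%% so the stack grows with N.
sum_body(0) -> 0;
sum_body(N) when N > 0 -> N + sum_body(N - 1).

%% Tail-recursive: the running total travels in the accumulator,
%% and the recursive call is the final action of each clause.
sum_tail(N) when N >= 0 -> sum_tail(N, 0).

sum_tail(0, Acc) -> Acc;
sum_tail(N, Acc) -> sum_tail(N - 1, Acc + N).
```

Both tail_demo:sum_body(100) and tail_demo:sum_tail(100) return 5050. The loop() function in our hello module has the same shape as sum_tail/2, which is why it can run indefinitely.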

Now we send the goodbye message:


4> Pid ! goodbye.
goodbye

The crucial difference between this clause and the clause that matches the hello message is that this clause does not include a tail-recursive call back to loop(). As a result, the process effectively dies. We can confirm this by attempting to invoke the code in the hello clause once again:


5> Pid ! hello.
hello

And we see that no output is generated.
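If you would rather check this directly than infer it from the silence, the built-in is_process_alive/1 reports whether a process still exists. The sketch below is self-contained: alive_demo is a made-up module with its own copy of the receive loop, and it uses a monitor to wait until the process has really gone before checking.

```erlang
-module(alive_demo).
-export([run/0]).

%% Returns {AliveBefore, AliveAfter, ExitReason}.
run() ->
    Pid = spawn(fun loop/0),
    Before = is_process_alive(Pid),
    %% A monitor delivers a 'DOWN' message when the process exits.
    Ref = erlang:monitor(process, Pid),
    Pid ! goodbye,
    receive
        {'DOWN', Ref, process, Pid, Reason} ->
            {Before, is_process_alive(Pid), Reason}
    end.

%% Same shape as the loop in the hello module.
loop() ->
    receive
        hello ->
            io:format("Hello, World!~n"),
            loop();
        goodbye ->
            ok
    end.
```

alive_demo:run() returns {true,false,normal}: the process was alive while waiting, is gone after goodbye, and exited normally because loop() simply returned.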

The last important detail that I have omitted is the type of hello and goodbye. These are Erlang atoms, an extremely simple data type whose value is itself. Atoms are used heavily in message-passing (and other pattern-matching contexts) and are very easy to work with: you simply declare and go!
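As a quick illustration of atoms in pattern matching outside of receive, here is a sketch in which function clauses are selected by atom. The module atom_demo and the atoms it returns are hypothetical.

```erlang
-module(atom_demo).
-export([describe/1]).

%% Clauses are tried top to bottom; the first matching pattern wins.
describe(hello) -> greeting;
describe(goodbye) -> farewell;
describe(Other) when is_atom(Other) -> unknown_atom;
describe(_) -> not_an_atom.
```

atom_demo:describe(hello) returns greeting, atom_demo:describe(foo) returns unknown_atom, and atom_demo:describe(42) returns not_an_atom. No declaration is needed for any of these atoms; writing one is enough to create it.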


Re-Entry Checklist


Although the explanation has been verbose, I hope you agree that this Erlang "Hello, World!" communicates some interesting essentials of the Erlang programming language. These essentials concern in particular how Erlang implements the Actor Model, which is the kernel of its message-passing semantics and a key enabler for Erlang's capability for massively concurrent processing.



19 comments:

Anonymous said...

To be honest I find this rather verbose...

Shouldn't simple concepts stay simple even when they are translated into a new language's ecosystem?

Anonymous said...

Thank you for 'tail recursion is the bombay duck of computer science'.

I doubt I'll ever properly attribute it, but I'll remember that forever.

keithb said...


"I propose that the purpose of a "Hello, World!" program is to communicate something essential about the programming language in a small space."


Interesting proposal. It's not what "hello world" was invented for, though.

The original purpose of "hello world" was to have a suitably trivial item with which to check that one could successfully write, save, compile and execute a program at all in some new and unfamiliar programming environment. A new OS, or such like.

"Hello world" began as a smoke test for your ability to do any software development at all, and not a language tutorial. The point being that the program itself should be about the least complex one that would actually cause an observable side-effect in the world, so one could focus on the oddities of the environment.

Anonymous said...

Thanks for the great intro to Erlang! This was simple yet interesting enough to make me download Erlang and give it a try.

Robert Virding said...

Unfortunately you are forgetting one thing, while the call to io:format may look like your typical "run of the mill standard sequential language do it all in one process function call", internally there is enough concurrency and actor stuff going on under the hood to make most people happy.

Edward Garson said...

@anonymous:
"Thank you for 'tail recursion is the bombay duck of computer science'.

I doubt I'll ever properly attribute it, but I'll remember that forever."


Thanks for your kind comment. "You heard it here first!"
---
@keithb:
"Interesting proposal. It's not what "hello world" was invented for, though."

That's what "Hello, world" may have been invented for, but that has changed. In this day and age, we can reasonably expect the activities you cite to succeed without inordinate effort on the programmer's part.

I would say that "Hello world" has now rather become a means to understand how to perform those activities, and to provide some minimal insight into the programming language.

To quote Wikipedia (which we all know is the definitive source of All Things Correct ;-)

"Experienced programmers learning new languages can also gain a lot of information about a given language's syntax and structure from a hello world program."

From Hello world program

---
@anonymous:
"Thanks for the great intro to Erlang! [...]"

Glad you enjoyed it. Happy travels in Erlang-land!
---
@Robert Virding:
"Unfortunately you are forgetting one thing [...] there is enough concurrency and actor stuff going on under the hood to make most people happy."

Great to have you by, Robert.

[Readers: Robert is the author of an Erlang book that predates the Prags book. And the author of Lisp-Flavored Erlang, a lisp syntax front-end to the Erlang compiler. That means he totally kicks Erlang ass.]

Thanks for the heads up; I wasn't aware that io:format() was so sophisticated under the covers.

However, I wouldn't expect most hackers to read the implementation of io:format() any more than reading the printf() implementation for a "Hello world" in C.

It's certainly a good idea, one that now I intend to follow up on.

Thank you for your comment.

Anonymous said...

It's also a great way to demonstrate memory leaks, since every message sent to Pid that isn't 'hello' or 'goodbye' will get stored in memory forever, or at least until loop() is shut down.

Immo Hüneke said...

Hi Edward, I would suggest making the example clearer by not giving the same names to functions and atoms. For example, call the atoms that hello() accepts "greet" and "leave" instead of "hello" and "goodbye" (and another benefit of that is that you can have the function print "Goodbye world!" to the screen when it receives the "leave" message).

Edward Garson said...

@anonymous:
"It's also a great way to demonstrate memory leaks [...]"

Yes, that is absolutely true. Messages that do not match any of the receive clauses are moved to the process' save queue. When the process next receives a message, and if that message matches, then all the messages in the save queue are put back into the mailbox in the order in which they were received and reprocessed. The reason why is that the next received message can mutate the process in such a manner that messages that did not previously match would subsequently match (think guard statements for one).

The save queue works this way to enable flexibility with regards to message processing. And of course you would not design your system to send messages without expecting them to be processed at some point!

In any case, I could have added a third catch-all receive clause:

io:format("Neither 'hello' nor 'goodbye' received...")
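Spelled out as a whole module, that might look like the sketch below. The module name hello_catchall and the variable Other are illustrative choices, and I have added the unmatched message to the output so you can see what arrived.

```erlang
-module(hello_catchall).
-export([start/0]).

start() ->
    spawn(fun() -> loop() end).

loop() ->
    receive
        hello ->
            io:format("Hello, World!~n"),
            loop();
        goodbye ->
            ok;
        %% Catch-all: an unbound variable matches any message, so
        %% nothing is left behind in the mailbox.
        Other ->
            io:format("Neither 'hello' nor 'goodbye' received: ~p~n", [Other]),
            loop()
    end.
```

With this version, sending Pid ! foo prints the catch-all line and the process goes back to waiting, rather than leaving foo queued up.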

Thank you for pointing this out.

ppolv said...

I would see it more like a
"hello world in concurrent programming"

or even, with minor modification "hello world in distributed programming"

Dhanesh said...

Best introduction to Erlang I have ever seen. Thanks for it... you rock

Anonymous said...

Thanks for this small but useful article. I was trying to understand the Pid and message sending from long time and after reading this, I really got it.

Thanks a lot

cometarossa said...

Really an easy example, thanks

Stu Thompson said...

Thanks for the detailed explanation of "Hello World"... It was a perfect introduction for me, easing me into the world of Erlang. Not too shallow, not too deep.

Stu

Kenneth Larsson said...

Thank you! Good approach.

Anonymous said...

I believe that this is definitely a good hello world tutorial for Erlang. With the number of languages and the increasingly large amount of features they have, it is better to have a more verbose hello world that demonstrates more of the language syntax as opposed to a simple print statement.

I propose that hello world be broken into a 2 step process. The first step would be following the original hello world standards, as in just printing something. This shows the basics and how to compile something in the language. The second step would be what you demonstrated here, a much more in-depth explanation of the language that has much more practical value to those who may actually use the language and stick with it.

I definitely found this to be of value, and it was much easier to understand than another Erlang tutorial I previously found.

-Nicholas Bevacqua

Anonymous said...

I just installed Erlang and your hello world program is going to be my very first piece of code written in Erlang. Thanks for the tutorial!

RogerL said...

Notice that the simple

3> Pid ! hello.
Hello, world!
hello

can sometimes result in

5> Pid ! hello.
hello
Hello, world!

Why? concurrency! There is a race between printing the result of the io:format and the shell printing the result of the send expression...


Next feature to add is doing a software upgrade on the running process!

Anonymous said...

Most excellent! The idea of clearly demonstrating the mechanisms of Erlang's message passing, iteration, and decision processing done in a strong yet simple example.

And, the fine quote, "tail recursion is the bombay duck of computer science" makes this one of the better short intros to Erlang.

Thanks!!!