April 17, 2007

Listed below are links to weblogs that reference Adding Concurrency to Our Erlang Program:

Comments

Vincent Foley

I must confess that I don't understand the parallel code :(

Where are the processes stored? It seems to me that because you are using lists:foreach, you don't care about the return value of background_fetch/1, which as I understand it is the actual process object (as created by spawn/1). So where does gather_results/1 get its processes from? From a process ether? Also, why do you need to pass the result of isbns/0 to gather_results/1 since you don't use those values in lists:map?

Any clarification would be greatly appreciated!

Vincent.

Fernando J. Pereda

Vincent, note that the messages are being sent to the ParentPID. Erlang queues these messages until that ParentPID enters a receive block and matches them.

- ferdy
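
To make the queueing concrete, here is a minimal sketch (the module name mailbox_demo and the atoms a, b, c are made up for illustration; this is not the code from the post). The pids returned by spawn/1 are thrown away, yet nothing is lost: each child sends to the parent's pid, and the messages sit in the parent's mailbox until it reaches a receive. The list passed to the gathering step is only used to decide how many times to call receive.

```erlang
-module(mailbox_demo).
-export([run/0]).

%% Hypothetical demo: children send to the parent before the parent
%% ever calls receive; the messages wait in the parent's mailbox.
run() ->
    Parent = self(),
    Items = [a, b, c],
    %% The pids returned by spawn/1 are discarded, as with lists:foreach.
    lists:foreach(fun(Item) ->
                          spawn(fun() -> Parent ! {ok, Item} end)
                  end, Items),
    %% The parent is busy elsewhere for a moment; messages queue up meanwhile.
    timer:sleep(100),
    %% One receive per expected message. The list elements are ignored;
    %% only the length of the list matters here.
    Results = lists:map(fun(_) ->
                                receive {ok, Value} -> Value end
                        end, Items),
    %% Arrival order is not guaranteed, so sort before returning.
    lists:sort(Results).
```

Calling mailbox_demo:run() from the shell should return the three atoms in sorted order.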

Jo Stockley

These last two articles have got me interested in getting familiar with Erlang. I would like to know how the performance compares to other languages such as Ruby, Python and Lisp. What is the underlying implementation (interpreter, JIT compiler or what)?

Geoff Cant

Lazy erlang programmers might also do something like:
fetch_in_parallel() -> rpc:pmap({ranks, fetch_title_and_rank}, [], isbns()).

Though it doesn't illustrate spawning and message sending nearly as well, it spawns the processes in round-robin fashion over all the connected nodes() :)
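
A small self-contained sketch of the same call shape, using erlang:abs/1 instead of the Amazon lookup so it runs without a key (the module name pmap_demo is made up). rpc:pmap/3 takes {Module, Function}, a list of extra arguments, and the input list, and returns the results in the same order as the input. Since the function is called by module and name, it has to be exported.

```erlang
-module(pmap_demo).
-export([run/0]).

%% rpc:pmap({Module, Function}, ExtraArgs, List) evaluates
%% apply(Module, Function, [Elem | ExtraArgs]) for each element in
%% parallel and returns the results in the order of the input list.
run() ->
    rpc:pmap({erlang, abs}, [], [-1, 2, -3]).
```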

Vlad Balin

Jo, in general Erlang is faster than Python and Ruby, but slower than modern Lisp systems. Erlang uses both a byte-code interpreter and optimizing native code compilation techniques, and you can mix both modes in a single system.

On some tasks (such as parsing binary data) Erlang's performance can be comparable to C's. But that doesn't actually matter, because the performance of a real, complex system depends on a large set of factors that are not directly related to raw performance. For example, the Erlang web server YAWS outperforms Apache - you can see the comparison here: http://www.sics.se/~joe/apachevsyaws.html

Vlad Balin

Yeah, and at the same time, Erlang is VERY inefficient at handling strings represented "in this way", because the internal representation of such strings is a list of integers. So your string takes 8 times more memory than normal character strings in other languages.

What can we do about that? If you need performance, use binaries for string manipulation, and everything will be OK. :)
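
A short sketch of the two representations (the module name string_repr_demo is made up for illustration). A double-quoted string is just a list of character codes, while a binary stores one byte per character for plain ASCII text like this:

```erlang
-module(string_repr_demo).
-export([run/0]).

run() ->
    Str = "erlang",          % a list of integers
    Bin = <<"erlang">>,      % a binary, one byte per character here
    true = is_list(Str),
    true = (Str =:= [101, 114, 108, 97, 110, 103]),
    true = is_binary(Bin),
    6 = byte_size(Bin),
    %% Converting between the two forms.
    Bin = list_to_binary(Str),
    Str = binary_to_list(Bin),
    ok.
```

Each match either succeeds or raises badmatch, so a return value of ok means every statement above held.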

Aaron Tomb

Jo,

Erlang is typically _much_ faster than Ruby and Python, and faster than interpreted Lisp. Compiled Lisp might be able to compete with Erlang, though. Depending on the platform, Erlang either uses a highly-optimized bytecode interpreter or an even higher-performance JIT compiler. Most common platforms will use the JIT.

Additionally, Erlang has essentially the fastest concurrency primitives of any language in existence. While it's somewhat slower at executing sequential code than a language like C, it's _much_ faster at doing synchronization. As an example, there's a web server written in Erlang, called YAWS, which kicks Apache's butt. See here: http://www.sics.se/~joe/apachevsyaws.html

For more information on comparative programming language speeds, you can look here: http://shootout.alioth.debian.org/ Keep in mind, though, that most of these benchmarks are sort of toy programs, and the performance results may not translate to more complex applications.

anon

Eshell V5.5.4 (abort with ^G)
1> c(ranks)
1> .
{ok,ranks}
2> ranks:fetch_in_parallel().

=ERROR REPORT==== 18-Apr-2007::18:32:58 ===
Error in process with exit value: {{badmatch,[]},[{ranks,fetch_title_and_rank,1},{ranks,'-background_fetch/1-fun-0-',2}]}

Dave says: Did you add a valid Amazon key?

Tobbe

Just for the fun of it, try the debugger:
(if you've got the right gui libs. installed)

1> c(ranks, [debug_info]).
2> im(). % a window should pop up, ignore it for now...
3> ii(ranks).
4> iaa([init]).
5> ranks:fetch_in_parallel().

After the last command, you should get yet another window.
From here, single step, set break-points, study variable contents, etc.
Note: the first window shows the processes running the actual code,
double-click on a line to attach yourself to that particular process
and you'll get the second window.

Cheers, Tobbe

Damir

Till now I never realized how easy and straightforward it is to write Erlang code. I followed the examples and had a parallel web fetcher done in a few minutes.

Since I have a similar app written in Perl (using POE), I added some parsing and tried it out. I'm running this on my laptop, and the Perl script is running on a quad-processor server box. It significantly outperforms the Perl version.

The only problem is, my boss now knows about this speedup and I can just see his brain ticking... :-)

Very nice post, thanks. I learned a lot.

Mike

If I run timer:tc() repeatedly, the times change drastically. Sometimes the parallel version takes 3 times longer than the series function. Any ideas why?

> timer:tc(ranks, fetch_in_parallel, []).

run1: 3757482
run2: 1162909
run3: 6826861

> timer:tc(ranks, fetch_in_series, []).

run1: 3164756
run2: 5552845
run3: 2835604

Alex Blewitt

Dave, you don't need to abstract the PID = self() prior to the spawn.

You're correct that if writing it as:
spawn(fun() -> double(self(), Number) end)
then you'd have to do this, but you can also do it by launching the function with a name:
spawn(ranks,double,[self(),Number])
Not only is that better, but it makes the code execute ranks:double() rather than double(), which means that if you wanted to, you could replace the definition of ranks:double() with a new version and it would pick up the new one. In your example, you're baking in the version of the code when the spawn argument is done.

Doesn't make much of a difference here, but if it was being called in a server loop then it would make a difference. But I can see the syntax being fiddly if you wanted to pass in lots of args and thought you had to bind them to names first.
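
The difference in where self() gets evaluated can be sketched like this (the module spawn_self_demo and its reply/2 helper are made up for illustration; they are not from the post). With spawn/3 the argument list is built in the parent, so self() is the parent's pid. Inside a fun, self() runs in the new process, so the reply goes to the child itself and the parent never sees it.

```erlang
-module(spawn_self_demo).
-export([run/0, reply/2]).

%% Hypothetical helper; exported so that spawn/3 can call it by name.
reply(To, Tag) ->
    To ! {Tag, self()}.

run() ->
    Parent = self(),
    %% 1. Pid bound before the spawn: the fun closes over Parent.
    spawn(fun() -> reply(Parent, bound) end),
    %% 2. spawn/3: the argument list is evaluated here, in the parent.
    spawn(?MODULE, reply, [self(), mfa]),
    %% 3. self() inside the fun is the child's own pid, so this message
    %%    never reaches the parent.
    spawn(fun() -> reply(self(), lost) end),
    Got1 = receive {bound, _} -> true after 1000 -> false end,
    Got2 = receive {mfa, _} -> true after 1000 -> false end,
    Got3 = receive {lost, _} -> true after 200 -> false end,
    {Got1, Got2, Got3}.
```

The first two forms deliver their message to the parent; the third times out.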

Richard

Since the rank code goes out to Amazon for the ranking values, the response time is based on round trip.
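
Because network round trips vary so much, a single timer:tc/3 reading says little. One way to compare the two versions is to time several runs and look at the median, as in this sketch (the module timing_demo and both function names are made up for illustration):

```erlang
-module(timing_demo).
-export([run_times/4, median/1]).

%% Runs apply(M, F, A) N times and returns the elapsed time of each
%% run in microseconds, as reported by timer:tc/3.
run_times(M, F, A, N) ->
    [element(1, timer:tc(M, F, A)) || _ <- lists:seq(1, N)].

%% Middle element of the sorted list (the lower middle one when the
%% list has an even length). Expects a non-empty list.
median(Times) ->
    Sorted = lists:sort(Times),
    lists:nth((length(Sorted) + 1) div 2, Sorted).
```

For example, timing_demo:median(timing_demo:run_times(ranks, fetch_in_parallel, [], 9)) would give a steadier figure than any single run, assuming the ranks module from the post is loaded.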

Mark Aufflick

You can even let someone write the concurrency primitives like pmap for you!

http://code.google.com/p/plists/

is an erlang library of parallel list functions:

...plists is a drop-in replacement for the Erlang module lists, making most list operations parallel. It can operate on each element in parallel, for IO-bound operations, on sublists in parallel, for taking advantage of multi-core machines with CPU-bound operations, and across erlang nodes, for parallelizing inside a cluster. It handles errors and node failures. It can be configured, tuned, and tweaked to get optimal performance while minimizing overhead.

...

This module also includes a simple mapreduce implementation, and the function runmany. All the other functions are implemented with runmany, which is a generalization of parallel list operations.

himadri

Say you want to make this a client/server setup: how do you go about it?

This is what I tried, but I couldn't get it running.


In the ranks.erl I do away with background_fetch/1 and do this :

set_queries_running(ISBNS) ->
lists:foreach(fun(ISBN) -> parentPID ! {ok,fetch_title_and_rank(ISBN)} end, ISBNS).

I then create a separate client.erl which has this :

-module(client).
-(export[gather_results/1]).

gather_results() ->

register(parentPID,spawn(fun() -> loop() end)),
ranks_del:fetch_in_parallel().

loop() ->
receive
{ ok,Anything } -> Anything
end.

This doesn't work. What am I doing wrong?
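
One possible arrangement is sketched below, with loud caveats: the module collector_demo, the registered name collector, the placeholder ISBN strings and the fake_fetch/1 stand-in are all made up so the example runs without an Amazon key, and none of it is the post's code. Compared with the attempt above, three things change. The export attribute uses the usual form -export([Name/Arity]) with an arity that matches the definition. The receiving loop recurses once per expected message instead of returning after the first one. And the collected results are sent back to the caller rather than being dropped when the loop ends.

```erlang
-module(collector_demo).
-export([run/0]).

%% Stand-in for ranks:fetch_title_and_rank/1 (hypothetical, no network).
fake_fetch(ISBN) ->
    {ISBN, fake_rank}.

run() ->
    ISBNs = ["isbn-1", "isbn-22"],   % placeholder values, not real ISBNs
    Caller = self(),
    %% The registered name is an atom, so workers can send to it directly.
    register(collector,
             spawn(fun() -> loop(Caller, length(ISBNs), []) end)),
    lists:foreach(fun(ISBN) ->
                          spawn(fun() -> collector ! {ok, fake_fetch(ISBN)} end)
                  end, ISBNs),
    receive
        {done, Results} -> lists:sort(Results)
    after 5000 ->
        timeout
    end.

%% Collects N results, then frees the name and reports to the caller.
loop(Caller, 0, Acc) ->
    unregister(collector),
    Caller ! {done, Acc};
loop(Caller, N, Acc) ->
    receive
        {ok, Result} -> loop(Caller, N - 1, [Result | Acc])
    end.
```

Swapping fake_fetch/1 for the real lookup should be the only change needed to try this against the ranks module, assuming that function is exported.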

The comments to this entry are closed.