Distributed Evolution for Improving Shellsort

created Saturday, 28 December 2002
updated Tuesday, 11 February 2003

Cite: Please cite with a form similar to that of the entry for this article in its own bibliography ([]).

1 Introduction

In [], I described how I used a genetic algorithm to find new sequences of increments for Shellsort that are superior to the sequence proposed by Sedgewick ([], [], & lots of other books about data structures).

That implementation of the genetic algorithm took only minor advantage of parallelism-just three computers in my apartment.

Now, I would like a new, simpler implementation that takes better advantage of parallelism. That would allow the genetic algorithm to use a larger population & scan more generations.

Herein are my design notes.

2 Client/Server Structure

I'll have a client/server structure. I run the server, of course. The clients are any computers that are running a custom client application.

I guess anyone can run a client, though that might create issues of trust. So maybe I'll keep client membership as private.

The task of a client is to retrieve a sequence of increments for Shellsort & an array. The client parameterizes Shellsort with the sequence, uses it to sort the array, & counts the comparisons & data copies of the sort operation. Then the client reports the result to the server & retrieves a new sequence & array.

To reduce the hassle for users, the client should be easy to install, like a Perl client in a single-file or maybe even a Web page with an embedded client-side script or applet. When there are problems connecting to the server, or when the server is too busy, or when the server temporarily has no work, the client program does not raise an alarm. It quietly accepts the situation & tries again later.

The server does all the other work. It holds the population. It creates new organisms. It doles work orders (sequences & arrays) to clients. It generates reports at the end of each generation & of each run.

The easiest way to implement the server is probably as CGI programs accessed through my Web server. I'll implement them in Lisp because I like Lisp.

3 Data Structures

4 The Client Program

The client program should be short, simple, & portable so as not to deter people from installing it & making their computers act as clients.

Perl satisfies these. I can write the client program in Perl, keep it short & portable. Perl is installed on many systems. (I'd guess it's installed on almost all Unixes & even on many Winders systems.)

The client program must fetch a ``work order'' from the server. The work order contains a sequence of increments for Shell sort & a sequence of arrays to sort with the sequence of increments. While the client app sorts the arrays, it counts the number of swaps & comparisons it does. It reports them to the server, then gets another sequence of increments & sequence of arrays to sort.

Sometimes the server might not have work for the client. In that case, the server's reply could indicate it & could suggest an amount of time to wait before the client asks again for work.

Should a client have a GUID assigned by the server? It might be useful to distinguish between many clients that were behind a single gateway with NAT. It couldn't be used for secure identification because an untrustworthy client could just forget the GUID; it could always pretend it had never been assigned a GUID. If there were many honest clients behind the same NAT gateway, the GUIDs might be used to assign amounts of work so that each machine takes about the same amount of time to complete its work order. At least, the GUIDs could be used to keep statistics.¹

5 Security

Some clients might be dishonest or defective. The server could track that. Given some unknown client, the server might periodically ask a trusted client to verify the unknown client's work.

The server could track the reliability of a client. Over time, a trustworthy client would build a reputation. Its work would require independent verification less frequently. An untrustworthy client would be exposed by independent verification. Its requests for work orders would be virtually ignored; it would be told to try back later.

I guess if a client proved untrustworthy enough, it might be permanently blacklisted.

There is probably a mathematically determinable trust relationship that would dictate the frequency that a client's work would need verification by a trusted client.

Bibliography

Gene Michael Stover 2008-04-20