Synchronous and asynchronous updating in cellular automata
For clarity, I'm not looking for an optimal algorithm so much as something I can rapidly implement in CUDA that's likely to give a significant speedup over my CPU implementation.
Programmer time is much more of a limiting factor than computer time in this project.
I should also clarify that an asynchronous cellular automaton is a rather different thing from a synchronous one, and techniques for parallelising synchronous CAs (such as Conway's Life) cannot easily be adapted to this problem.
The difference is that a synchronous CA updates every cell simultaneously at every time step, whereas an asynchronous one updates a randomly chosen local region at every time step as outlined below.
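To make the distinction concrete, here is a minimal sketch on a 1D periodic lattice. The `local_rule` function (a majority vote) is a hypothetical placeholder, not the rule used in this project, and the asynchronous step updates a single site rather than a larger local region.

```python
import random

def local_rule(left, centre, right):
    # Hypothetical placeholder rule (majority vote); the real rule is not shown in the post.
    return 1 if left + centre + right >= 2 else 0

def synchronous_step(cells):
    # Every cell is updated at once, reading only the previous generation.
    n = len(cells)
    return [local_rule(cells[(i - 1) % n], cells[i], cells[(i + 1) % n])
            for i in range(n)]

def asynchronous_step(cells, rng):
    # One randomly chosen site is updated in place; later steps see the change.
    n = len(cells)
    i = rng.randrange(n)
    cells[i] = local_rule(cells[(i - 1) % n], cells[i], cells[(i + 1) % n])
    return i
```

The synchronous step is trivially parallel because it only reads the old state. The asynchronous step is sequential by construction, since each update may depend on the result of the previous one.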
This might have some [...]
[...] Compute that looks like a fantastic tool, but from my initial (rather cursory) investigation, it looks like the stencil code paradigm is inherently synchronous, so it's probably not well suited to what I'm trying to do.
Can you provide a few more details on how you would parallelize this using SIMT? Or can the work involved with updating a single pair be spread over 32 or more threads?
[...] so I wouldn't have thought it makes much sense to spread it over multiple threads.
I need to give some more thought to whether the performance of this algorithm can be improved, and how to extend this algorithm to deal with the case where multiple cells are updated simultaneously in the ACA.
The idea is to replace the asynchronous CA (henceforth ACA) with a stochastic synchronous CA (SCA) that behaves equivalently.
To do this we first imagine that the ACA is a Poisson process.
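One way to picture the Poisson-process view is to give every site its own independent clock and update a site whenever its clock rings. The sketch below assumes equal rates (rate 1 per site); the function name `poisson_update_order` and its parameters are made up for illustration. With equal rates, the sequence of sites that ring is uniformly random, as in the one-site-per-step ACA.

```python
import heapq
import random

def poisson_update_order(n_sites, t_end, rng):
    """Return (time, site) events for independent rate-1 clocks up to t_end."""
    # Assumption: every site has the same rate, so waiting times are Exp(1).
    clock = [(rng.expovariate(1.0), site) for site in range(n_sites)]
    heapq.heapify(clock)
    events = []
    while clock:
        t, site = heapq.heappop(clock)
        if t > t_end:
            continue  # this site's next ring is past the horizon; drop it
        events.append((t, site))
        # Schedule the same site's next ring after another exponential wait.
        heapq.heappush(clock, (t + rng.expovariate(1.0), site))
    return events
```

The events come out in time order, and each site rings about `t_end` times on average.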
(So variations of the chequerboard algorithm are not what I'm looking for.)

The main difficulty in parallelising the above algorithm is collisions.
Because all the calculations depend only on a local region of the lattice, it's possible for many lattice sites to be updated in parallel, as long as their neighbourhoods don't overlap. I can think of several ways, but I don't know which, if any, is the best one to implement.
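As an illustration of the non-overlap condition, here is a rough sketch that draws random candidate sites and greedily keeps only those whose neighbourhoods are disjoint from the ones already accepted. It assumes a 2D periodic lattice and square neighbourhoods of a given `radius`; the function names are hypothetical. Discarding colliding candidates may change the statistics relative to the purely sequential process, so this is not offered as an exact equivalent.

```python
import random

def torus_dist(a, b, size):
    # Distance between two coordinates on a periodic axis of the given size.
    d = abs(a - b)
    return min(d, size - d)

def pick_non_overlapping_sites(width, height, n_candidates, radius, rng):
    """Greedily keep random candidate sites whose neighbourhoods do not overlap."""
    accepted = []
    for _ in range(n_candidates):
        x, y = rng.randrange(width), rng.randrange(height)
        # Two square neighbourhoods of this radius overlap when the sites are
        # within 2*radius of each other along both axes (Chebyshev distance).
        if all(max(torus_dist(x, ax, width), torus_dist(y, ay, height)) > 2 * radius
               for ax, ay in accepted):
            accepted.append((x, y))
    return accepted
```

Every site in the returned list could then be handed to its own thread, since no two of them read or write the same cells. In a CUDA version the serial acceptance loop would itself need a parallel replacement.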