[mdlug] Parallella: A Supercomputer For Everyone by Adapteva — Kickstarter

Adam Tauno Williams awilliam at whitemice.org
Thu Oct 11 06:29:33 EDT 2012


On Wed, 2012-10-10 at 17:48 -0400, Aaron Kulkis wrote:
> Jonathan Billings wrote:
> > On Wed, Oct 10, 2012 at 12:34:16PM -0700, Art Dries wrote:
> >> I had a bioinformatics project a year ago that could have used this.
> >> (genetic analysis scales very well)
> >> Anyone into high-end crypto could use a pair for an end-to-end VPN.
> >> (isn't math fun?)
> >> The specs, API, and code will all be OPEN, so the possibilities are endless.
> >> For $100, this would make an excellent (insert idea here)
> > True, but I am trying to dispel any notion that this thing will
> > instantly turn your laptop into a 45GHz system.  The extra cores on
> > the system are specialized, and you need to write code specifically to
> > use them.
> One simple solution:
> Guest OS running in the ARM-space, running its own set of processes
> compiled (or cross-compiled) to ARM
> Or even simpler... do the whole distro compiled for ARM, and then
> you get rid of the non-uniform hardware problem.
> If you gave me a 3 GHz quadcore, and gave me the option of trading
> that for a 48-core multi-CPU ARM machine running at only 1 GHz,
> yes, sure, single-threaded apps will take longer to complete---MAYBE...

This has already been tried by several vendors; Sun had the T-series,
which was highly parallel.  You don't see those boxes everywhere in
large part because the theoretical advantage never really paid off.  On
something like a database server, where you have lots of threads and/or
workers, these boxes were supposed to be awesome.  They weren't.

Software really does have to be tweaked to get the bang-for-the-buck
from this type of setup.  And you are still going to run up against
other bottlenecks - primarily I/O and network.  Now you just have 100
concurrent processes making demands on those subsystems rather than 10.
The end result is that it just doesn't go that much faster.
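
Back-of-the-envelope (Amdahl's law; the 80% parallel fraction below is
a made-up number standing in for whatever part of the job doesn't
serialize on I/O):

    # Amdahl's law: speedup = 1 / ((1 - p) + p / n)
    #   p = fraction of the work that actually runs in parallel
    #   n = number of cores
    def speedup(p, n):
        return 1.0 / ((1.0 - p) + p / n)

    for n in (4, 8, 48):
        print("%2d cores: %.2fx" % (n, speedup(0.80, n)))
    #  4 cores: 2.50x
    #  8 cores: 3.33x
    # 48 cores: 4.62x   <- 12x the cores for less than 2x the gain

The flatter that curve gets, the more the extra cores just sit there
waiting on the same disk and NIC.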

> Depending on machine load, that single-threaded app might STILL
> get more clock-cycles (since CPU contention goes way down due to
> the plethora of CPUs available), 

Nah.  CPUs like Intel's Core i family are very smart and internally
concurrent (superscalar, out-of-order, hyper-threaded).  Maybe if you
have a cluster of really good CPUs, but a shoebox of ARMs is going to
get its butt kicked.
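
Rough numbers for the trade described above, assuming per-core speed
scales with clock and reusing the made-up 80% parallel fraction (both
assumptions, obviously):

    # Throughput in "1 GHz single-core units" for an 80% parallel job
    p = 0.80
    quad_3ghz  = 3.0 / ((1 - p) + p / 4)    # ~7.5
    arm48_1ghz = 1.0 / ((1 - p) + p / 48)   # ~4.6
    print(quad_3ghz, arm48_1ghz)

And a purely single-threaded app on the 1 GHz core is simply about 3x
slower, no matter how many idle neighbors it has.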

> and therefore complete faster
> than on a low-core, high clock-rate CPU.
> I have a 2006-era dual core... it's constantly getting bogged down
> whenever I want to do a couple CPU intensive things simultaneously.

A 2006-vintage machine [assuming it wasn't state-of-the-art at the
time] is going to hit lots of bottlenecks, most notably the front-side
bus and the speed of the RAM.

> I don't need parallelized software...what I need are parallel cores.

An i7 can give you eight threads (four cores plus hyper-threading), and
that is a lot.  A dual i7 can give you 16.

My laptop has an i7-2670QM; I can peg three cores at 100% and it
remains very responsive.  If I swamp the hard drive, though, it turns
into a sled.
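
For what it's worth, that "eight" is logical CPUs; a quick way to check
from Python (it counts hyperthreads, so an i7-2670QM reports 8):

    import multiprocessing
    # Logical CPUs visible to the OS, hyperthreads included:
    # 4 physical cores x 2 threads = 8 on this laptop.
    print(multiprocessing.cpu_count())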

> > This is why I brought up the GPGPU use of video cards -- pretty much
> > the same thing as this, although it's being used already.  Yes, it is
> > cool that it's all open, but it'll be tough to beat the economics of
> > using cheap gaming cards on regular PCs.


