CrowdProcess is a browser-powered distributed computing platform.
It's in the SPMD (Single Program, Multiple Data) class of applications: "Tasks are split up and run simultaneously on multiple processors with different input in order to obtain results faster". Except that in CrowdProcess the processors are Web Workers, and they don't share any memory.
Web Workers connect to CrowdProcess from partner websites.
The program defines a function called Run, which maps one element of the dataset to a result inside a Web Worker. Each pairing of a data element with the program is called a task, and each browser connected to the platform at any given instant is assigned one or several tasks. The result computed by each task is returned to CrowdProcess and streamed back to the user.
- Job: a program plus input data. It yields results.
- Task: the execution of the program with input data. A task yields a result.
- Result: the outcome of a task.
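The relationship between job, task, and result can be sketched locally. This is only an illustration of the model described above, not the CrowdProcess API: `runJobLocally` is a hypothetical stand-in for the platform, and the squaring `Run` is an arbitrary example program.

```javascript
// The program: a function named Run that maps one data element to one result.
// Squaring is an arbitrary example; any self-contained computation would do.
function Run(d) {
  return d * d;
}

// Hypothetical local stand-in for the platform (an assumption, not the real
// API): pair the program with each data element, one task per element, and
// collect one result per task.
function runJobLocally(program, data) {
  return data.map(function (element) {
    return program(element); // one task yields one result
  });
}

// A job is the program plus its input data; it yields results.
var results = runJobLocally(Run, [1, 2, 3, 4]);
console.log(results); // [ 1, 4, 9, 16 ]
```

On the platform the tasks would run in different browsers and results would arrive as they complete, so result order should not be assumed to match input order unless the platform documents it.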
Features
Stupidly Scalable and Easy to Use
Each web browser is equivalent to a CPU core, so imagine having an easy-to-manage beast of 10k cores at your fingertips. Use it with a couple of lines of code.
Streaming and Realtime or Durable
Results are delivered as soon as they're computed, and you may also retrieve them at a later time.
Caveats
Running in Web Workers
Task Running Time
Because tasks run in Web Workers, which in turn run in visitors' web browsers, you should plan them so that they take no longer than the average visitor session. The average session currently lasts about 6m50s, but you'll get better results if your tasks take less time than that.
It's perfectly OK to create a very large number of tasks: CrowdProcess deals with 1000 tasks that take 300 seconds each as well as it deals with 10000 tasks that take 30 seconds each. The second case, though, has a higher chance of each task finishing within a browser session.
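The trade-off between fewer long tasks and more short tasks can be sketched as a small planning helper. `planTasks`, `secondsPerItem`, and `targetSeconds` are hypothetical names introduced for this illustration; the per-item cost is something you would measure for your own program.

```javascript
// Group `items` into tasks so that each task is estimated to run for at most
// `targetSeconds`, given an estimated cost of `secondsPerItem` per item.
// Both numbers are assumptions supplied by the caller.
function planTasks(items, secondsPerItem, targetSeconds) {
  // At least one item per task, even if a single item exceeds the target.
  var perTask = Math.max(1, Math.floor(targetSeconds / secondsPerItem));
  var tasks = [];
  for (var i = 0; i < items.length; i += perTask) {
    tasks.push(items.slice(i, i + perTask));
  }
  return tasks;
}

// 300000 items at an assumed 1 second each: the same total work, cut two ways.
var items = [];
for (var i = 0; i < 300000; i++) items.push(i);

var coarse = planTasks(items, 1, 300); // tasks of about 300 seconds
var fine = planTasks(items, 1, 30);    // tasks of about 30 seconds
console.log(coarse.length, fine.length); // 1000 10000
```

Both plans cover the same work; the finer plan simply keeps each task well under the average session length.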
Input data is sent to the browsers, which have different internet connections, some slow and some fast. As a rule of thumb, plan for both and split input data into pieces of about 1 MB or less.
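One way to follow the 1 MB rule of thumb is to pack data elements greedily into chunks whose JSON encoding stays under a byte limit. This is a minimal sketch under the assumption that input travels as JSON; `chunkBySize` and `utf8Bytes` are hypothetical helpers, not part of CrowdProcess.

```javascript
var MAX_BYTES = 1024 * 1024; // about 1 MB, per the rule of thumb

// Byte length of a string when encoded as UTF-8.
function utf8Bytes(s) {
  return new TextEncoder().encode(s).length;
}

// Greedily pack items into arrays whose JSON encoding is at most maxBytes.
// An item that is larger than maxBytes on its own still gets its own chunk.
function chunkBySize(items, maxBytes) {
  var chunks = [];
  var current = [];
  var currentBytes = 2; // the enclosing "[" and "]"
  items.forEach(function (item) {
    var itemBytes = utf8Bytes(JSON.stringify(item));
    var separator = current.length > 0 ? 1 : 0; // the "," between elements
    if (current.length > 0 && currentBytes + separator + itemBytes > maxBytes) {
      chunks.push(current); // close the full chunk and start a new one
      current = [];
      currentBytes = 2;
      separator = 0;
    }
    current.push(item);
    currentBytes += separator + itemBytes;
  });
  if (current.length > 0) chunks.push(current);
  return chunks;
}

// Five strings of roughly 400 KB each: two fit under 1 MB, three do not.
var bigItems = [];
for (var i = 0; i < 5; i++) bigItems.push('x'.repeat(400000));

var chunks = chunkBySize(bigItems, MAX_BYTES);
console.log(chunks.map(function (c) { return c.length; })); // [ 2, 2, 1 ]
```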
If input data is shared across tasks, it's a very good idea to make it part of the program because less data has to travel and it's more efficient.
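A minimal sketch of moving shared data into the program, assuming the program is handled as JavaScript source text. The price table, the `SHARED` constant, and the local evaluation with `new Function` are illustrative assumptions, not the platform's submission mechanism.

```javascript
// Data every task needs (illustrative): a price table in cents.
var shared = { prices: { apple: 30, pear: 45 } };

// Build the program source once with the shared data inlined as a constant,
// so each task's input carries only what is unique to that task.
var programSource =
  'var SHARED = ' + JSON.stringify(shared) + ';\n' +
  'function Run(d) {\n' +
  '  return d.qty * SHARED.prices[d.item];\n' +
  '}';

// Per-task input stays small: no copy of the price table in each element.
var data = [
  { item: 'apple', qty: 3 },
  { item: 'pear', qty: 4 }
];

// Local check only: evaluate the source and apply Run to each element.
var runLocally = new Function(programSource + '\nreturn Run;')();
var totals = data.map(function (d) { return runLocally(d); });
console.log(totals); // [ 90, 180 ]
```

The alternative, repeating the price table inside every data element, would send the same bytes once per task instead of once with the program.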
Although every connection is SSL encrypted (from you to the browsers and back), each browser has access to the tasks that are scheduled on it.
Tasks in browsers are ephemeral and exist only during computation time; nothing is ever stored on the browser.
If you have sensitive information, perhaps you might be interested in our Enterprise Grid solution.
Performance in CrowdProcess is affected by the number of browsers connected to the platform and their network bandwidth.
What runs well on CrowdProcess?
Most embarrassingly parallel, CPU heavy, low I/O applications run well on CrowdProcess.
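As an illustration of that profile, the sketch below counts primes by trial division. Each task receives only a small range object and returns a single number, while the work in between is pure CPU. The range format (`from`, `to`) and the local `map` and `reduce` are assumptions made for the example.

```javascript
// Run counts the primes in [d.from, d.to): CPU heavy, tiny input and output,
// and no dependency on any other task.
function Run(d) {
  var count = 0;
  for (var n = Math.max(2, d.from); n < d.to; n++) {
    var isPrime = true;
    for (var k = 2; k * k <= n; k++) {
      if (n % k === 0) { isPrime = false; break; }
    }
    if (isPrime) count++;
  }
  return count;
}

// Each data element is an independent range, so tasks can run in any order.
var ranges = [];
for (var start = 0; start < 100; start += 25) {
  ranges.push({ from: start, to: start + 25 });
}

// Local stand-in for the platform: run every task, then combine the results.
var total = ranges.map(Run).reduce(function (a, b) { return a + b; }, 0);
console.log(total); // 25
```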