A common question from engineers new to Hadoop is
“Can we use these servers attached to SAN storage for Hadoop?”
The short answer is “Yes”, but perhaps the question should be:
“Is this server & storage correct for Hadoop?”
As with most software, Hadoop will run on just about anything.
However, job performance, the ability to scale, and cluster cost will all be affected.
Carefully following proven design patterns for Hadoop will not only ensure the cluster's ability to scale but also keep job durations on de facto standard tests within expected ranges. This paper compares the retail costs of “Standard” Hadoop servers using local storage with those of “Blade” servers using SAN-attached storage.