One of my favorite systems papers ever is the COST paper, which examines a number of big-data platforms and observes that many of them have the desirable property of scaling (near-)linearly with available hardware, but do so at the cost of being ludicrously less efficient than a tuned single-threaded implementation.