Deliverable 3.2 DataBench Toolbox – Alpha including support for reusing of existing benchmarks
This document presents the initial architecture of the DataBench Toolbox. The DataBench Toolbox will include formal mechanisms to ease the integration of existing and new big data benchmarks into a common framework that helps stakeholders find, select, download, and execute benchmarks, and obtain a set of homogenized metrics. The DataBench Toolbox will be an integral part of the DataBench Framework, which will ultimately derive recommendations and business insights from big data benchmarks.
The present document therefore starts by putting the DataBench Toolbox in context with the rest of the DataBench Framework, and then dives into the details of the envisaged architecture. It is important to note that the Toolbox will build on existing efforts in big data benchmarking rather than propose new benchmarks. The DataBench Toolbox therefore aims to be an umbrella framework for big data benchmarking.

The idea behind it is to provide ways to declare new benchmarks into the Toolbox, together with a set of automation mechanisms and recommendations that allow these tools to become part of the ecosystem. Due to the differing nature and technical scope of the existing tools, the degree of automation may vary from one tool to another. The baseline will be the possibility to download the selected benchmarking tools from the Toolbox web user interface, and to provide adapters that feed the benchmarking results into the DataBench Toolbox in order to obtain homogenized technical metrics (e.g. throughput) that make results from several benchmarks comparable.
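To illustrate the adapter idea described above, the following is a minimal sketch of how a per-benchmark adapter could translate tool-specific output into a common metrics record. All names here (`BenchmarkAdapter`, `HomogenizedMetrics`, the CSV output format) are hypothetical illustrations, not part of the actual DataBench Toolbox design.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class HomogenizedMetrics:
    """Common metrics record shared by all benchmarks (illustrative)."""
    benchmark: str
    throughput_ops_per_s: float
    latency_ms: float


class BenchmarkAdapter(ABC):
    """Translates one tool's raw output into the common metrics record."""

    @abstractmethod
    def parse(self, raw_output: str) -> HomogenizedMetrics:
        ...


class ExampleCsvAdapter(BenchmarkAdapter):
    """Hypothetical adapter for a tool emitting 'ops,seconds,avg_latency_ms'."""

    def __init__(self, benchmark_name: str):
        self.benchmark_name = benchmark_name

    def parse(self, raw_output: str) -> HomogenizedMetrics:
        ops, seconds, latency_ms = (float(x) for x in raw_output.strip().split(","))
        return HomogenizedMetrics(
            benchmark=self.benchmark_name,
            throughput_ops_per_s=ops / seconds,  # normalize to a shared unit
            latency_ms=latency_ms,
        )


adapter = ExampleCsvAdapter("example-benchmark")
metrics = adapter.parse("120000,60,3.5")
print(metrics.throughput_ops_per_s)  # 2000.0
```

Because every adapter emits the same record type, results from different benchmarking tools can be stored and compared side by side, which is the purpose of the homogenization step described above.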