test1:
Name:
DataBench Project Fiche
Authors:
- Richard Stevens
- Cristina Pepato
Abstract:
NA
Name:
TheWebConf 2018
Authors:
- Marko Grobelnik
Abstract:
n/a
Name:
2019 BenchCouncil International Symposium on Benchmarking, Measuring and Optimizing (Bench'19)
Authors:
- Todor Ivanov
- Timo Eichhorn
- Arne Berre
- Tomás Pariente Lobo
- Iván Martínez
- Ricardo Ruiz
- Barbara Pernici
- Chiara Francalanci
Abstract:
In the era of Big Data and AI, it is challenging to know all technical and business advantages of the emerging technologies. The goal of DataBench is to design a benchmarking process helping organizations developing Big Data Technologies (BDT) to reach for excellence and constantly improve their performance, by measuring their technology development activity against parameters of high business relevance. This paper focuses on the internals of the DataBench framework and presents our methodological workflow and framework architecture.
MORE Information:
Name:
Promotion of Virtual BenchLearning (BDVA PPP Mailing List)
Authors:
- Ana María Morales
Abstract:
Promotion of 3rd Virtual BenchLearning (BDVA PPP Mailing List)
Name:
Promotion of Virtual BenchLearning (IDC Mailing List)
Authors:
- Cristina Pepato
Abstract:
Promotion of 3rd Virtual BenchLearning (IDC Mailing List)
Name:
Promotion of DataBench Virtual BenchLearning - Atos Internal Newsletter
Authors:
- Ana María Morales
Abstract:
Promotion of 2nd DataBench Virtual BenchLearning - Atos Internal Newsletter Iberia (1100 contacts), Atos Research and Innovation Internal Newsletter (180 contacts)
Name:
Promotion of Virtual BenchLearning (BDVA PPP Mailing List)
Authors:
- Ana María Morales
Abstract:
Promotion of Virtual BenchLearning (BDVA PPP Mailing List)
Name:
Promotion of DataBench Virtual BenchLearning - Atos Internal Newsletter
Authors:
- Ana María Morales
Abstract:
Promotion of DataBench Virtual BenchLearning - Atos Internal Newsletter Iberia (1100 contacts), Atos Research and Innovation Internal Newsletter (180 contacts)
Name:
Promotion of Virtual BenchLearning (IDC Mailing List)
Authors:
- Cristina Pepato
Abstract:
Promotion of Virtual BenchLearning (IDC Mailing List)
Name:
Promotion of Virtual BenchLearning (BDVA PPP Mailing List)
Authors:
- Ana María Morales
Abstract:
Promotion of Virtual BenchLearning (BDVA PPP Mailing List)
Name:
EBDVF 2019
Authors:
- Richard Stevens
- Stefania Aguzzi
Abstract:
DataBench promotion @IDC booth
Name:
ICT 2018
Authors:
- Nuria de Lama Sanchez
- Stefania Aguzzi
- Cristina Pepato
Abstract:
Presence at the booth and distribution of marketing materials, audience engagement, DataBench video display
Name:
DataBench Roll-up
Authors:
- Nuria de Lama Sanchez
- Ana María Morales
- Stefania Aguzzi
- Cristina Pepato
Abstract:
NA
Name:
DataBench Survey Infographic
Authors:
- Ana María Morales
Abstract:
The DataBench project has developed the first of a series of infographics presenting the results of a survey on Big Data and Analytics (BDA) covering 700 European companies.
The survey, carried out by DataBench between September and October 2018, investigated the actual or planned use of BDA by 700 European businesses in 11 EU Member States and identified the relevance of business KPIs for BDA users. The detailed analysis is presented in report D2.2 of the DataBench project.
The infographic presents the main highlights of the survey concerning:
Level of adoption of BDA solutions by EU businesses from different Industries and Business Areas
Business goals driving the adoption of this type of technologies
Most relevant KPI categories for measuring the business impact
Achievements and expected benefits for using BDA solutions
Current use of analytic techniques
Current level of Big Data skills gap
https://www.databench.eu/wp-content/uploads/2019/03/databench_infographic1.pdf
https://www.databench.eu/big-data-analytics-big-opportunities-for-eu-companies-have-a-look-at-the-first-databench-infographic/
Name:
Flyer
Authors:
- Ana María Morales
- Nuria de Lama Sanchez
- Stefania Aguzzi
- Cristina Pepato
Abstract:
Flyer to download the content of the WhitePaper: DataBench Toolbox
Name:
DataBench Handout
Authors:
- Nuria de Lama Sanchez
- Ana María Morales
- Stefania Aguzzi
- Cristina Pepato
Abstract:
NA
Name:
Bled Strategic Forum
Authors:
- Marko Grobelnik
Abstract:
At times our society feels like a runaway train. Technologies in artificial intelligence, biotech, nanotech, to name only a few fields, are developing at an extreme pace, but are not accompanied by a strategic analysis of their impact not only on our daily lives but the whole of humanity – on social relations, on our emotional and biological selves, as well as on our legal systems and regulatory frameworks. The infrastructure for such fundamental changes is not in place.
The panel will explore cutting-edge technological advances and how they will affect our lives and the human race as a whole. What changes can we foresee in the coming decades? How can we ensure that they will lead us to a better existence and that we use technology to improve our lives and not to perpetuate the cycles of global violence and wars that mark human history?
Name:
2018 BenchCouncil International Symposium on Benchmarking, Measuring and Optimizing (Bench'18)
Authors:
- Arne Berre
Abstract:
Benchmarking for Digital Platforms with Big Data, IoT, AI, Cloud, HPC and CyberSecurity is being introduced based on European activities in this area, in particular related to the DataBench project and work in BDVA, AIOTI and ECSO.
Name:
SIGIR 2018
Authors:
- Rayid Ghani
- Marko Grobelnik
Abstract:
Can data science help reduce police violence and misconduct? Can it help increase retention of patients in care? Can it help prevent children from getting lead poisoning? Can it help cities better target limited resources to improve the lives of citizens? We're all aware of the hype around data science and related buzzwords right now, but turning this hype into social impact takes cross-disciplinary training, teams, and methods.
Name:
EBDVF'2018
Authors:
- Arne Berre
Abstract:
Benchmarking session at EBDVF'2018
Introduction - Arne Berre/Axel Ngonga
Designing Big Data Benchmarks - Irini Fundulaki
LDBC - Peter Boncz
DataBench - Gabriella Cattaneo/Tomas P. Lobo
Holistic Benchmarking - Axel Ngonga/Gayane Sedrakyan
Benchmarking as a service - Pavel Smirnov (AGT)
The EU Big Data Inducement Prize challenge - Kimmo Rossi (EC)
Presentation of solutions of winners of the EU Big Data Inducement Prize
Summary and Discussion
Name:
24th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Authors:
- Dunja Mladenic
- James Hodson
- Marko Grobelnik
Abstract:
Moderation of KDD 2018 Project Showcase
Name:
DB Test '20
Authors:
- Todor Ivanov
- Ahmad Ghazal
- Alain Crolotte
- Pekka Kostamaa
- Yoseph Ghazal
Abstract:
Significant effort was put into big data benchmarking with a focus on end-to-end applications. While covering basic functionalities implicitly, the details of the individual contributions to the overall performance are hidden. As a result, end-to-end benchmarks could be biased toward certain basic functions. Micro-benchmarks are more explicit at covering basic functionalities, but they are usually targeted at some highly specialized functions. In this paper we present CoreBigBench, a benchmark that focuses on the most common big data engine/platform functionalities such as scans, two-way joins, common UDF execution and more. These common functionalities are benchmarked over relational and key-value data models, which cover the majority of data models. The benchmark consists of 22 queries applied to sales data and key-value web logs covering the basic functionalities. We ran CoreBigBench on Hive as a proof of concept, verified that the benchmark is easy to deploy, and collected performance data. Finally, we believe that CoreBigBench is a good fit for performance testing of commercial big data engines focused on basic engine functionalities not covered in end-to-end benchmarks.
MORE Information:
Name:
Encyclopedia of Big Data Technologies
Authors:
- Todor Ivanov
- Roberto V. Zicari
Abstract:
This chapter reviews the evolution of analytics benchmarks and their current state (as of 2017). It starts with an overview of the most relevant benchmarking organizations and their benchmark standards, and outlines the latest benchmark developments and initiatives targeting emerging Big Data Analytics systems. Last but not least, the typical benchmark components are described, as well as the different goals that these benchmarks try to achieve.
MORE Information:
Name:
DBTest@SIGMOD 2018
Authors:
- Todor Ivanov
- Roberto V. Zicari
- Ahmad Ghazal
- Patrick Bedué
Abstract:
BigBench, standardized as TPCx-BB, is a popular application benchmark that targets Big Data storage and processing systems. BigBench V2 addresses some of the BigBench limitations by introducing a new simplified data model, semi-structured web logs in JSON file format and new queries mandating late binding. However, it still covers only batch processing workloads, and the Big Data velocity characteristic is not addressed. This work extends the BigBench V2 benchmark with a data streaming component that simulates typical statistical and predictive analytics queries in a retail business scenario. Our approach is to preserve the existing BigBench design and introduce a new streaming component that supports two data streaming modes: active and passive. In active mode, the data stream generation and processing happen in parallel, whereas in passive mode, the data stream is pre-generated before the actual stream processing. The stream workload consists of five queries inspired by the existing 30 BigBench queries. To validate the proposed streaming extension, the two streaming modes were implemented and tested using Kafka and Spark Streaming. The experimental results prove the feasibility of our benchmark design. Finally, we outline design challenges and future plans for improving the proposed BigBench extension.
MORE Information:
Name:
ICPE 2018, Berlin
Authors:
- Todor Ivanov
- Jason Taaffe
Abstract:
In the Big Data era, stream processing has become a common requirement for many data-intensive applications. This has led to many advances in the development and adoption of large-scale streaming systems. Spark and Flink have become a popular choice for many developers as they combine both batch and streaming capabilities in a single system. However, the introduction of Spark Structured Streaming in version 2.0 opened up completely new features for SparkSQL, which are otherwise only available in Apache Calcite. This work focuses on the new Spark Structured Streaming and analyses it by diving into its internal functionalities. With the help of a micro-benchmark consisting of streaming queries, we perform initial experiments evaluating the technology. Our results show that Spark Structured Streaming is able to run multiple queries successfully in parallel on data with changing velocity and volume sizes.
MORE Information:
Name:
ICPE 2018, Berlin
Authors:
- Todor Ivanov
- Rekha Singhal
Abstract:
Distributed big data processing and analytics applications demand a comprehensive end-to-end architecture stack consisting of big data technologies. However, there are many possible architecture patterns (e.g. Lambda, Kappa or Pipeline architectures) to choose from when implementing the application requirements. A big data technology in isolation may be best performing for a particular application, but its performance in connection with other technologies depends on the connectors and the environment. Similarly, existing big data benchmarks evaluate the performance of different technologies in isolation, but no work has been done on benchmarking big data architecture stacks as a whole. For example, BigBench (TPCx-BB) may be used to evaluate the performance of Spark, but is it applicable to PySpark or to Spark with Kafka stack as well? What is the impact of having different programming environments and/or any other technology like Spark? This vision paper proposes a new category of benchmark, called ABench, to fill this gap and discusses key aspects necessary for the performance evaluation of different big data architecture stacks.
MORE Information:
Name:
BIG DATA CHALLENGES IN SMART MANUFACTURING INDUSTRY - A Whitepaper on Digital Europe Big Data Challenges for Smart Manufacturing Industry Version 2020
Authors:
- Gabriella Cattaneo
- Sergio Gusmeroli
Abstract:
This is the second edition of this paper about Big Data challenges in the Smart Manufacturing Industry. In the 2018 edition, the focus was on the alignment between the BDVA Reference Model and the EFFRA Strategic Research and Innovation Agenda through analysis of their respective reference architectures and alignment of their respective technical challenges. The main outcome of the 2018 activities was the identification of 56 Big Data technical challenges in the three Manufacturing Grand Scenarios of Smart Factory, Smart Product and Smart Supply Chain. Now in the 2020 edition, a twofold approach was chosen: on the one side (continuity evolutionary approach), we are updating the previous content, not just of the technology but also of the legal-sociobusiness landscape. On the other side, H2020 is in its last period and we need to think of more disruptive and revolutionary topics to be implemented along the future Multiannual Financial Framework (MFF 2021-2027) and respectively in both Digital Europe and Horizon Europe programs. This second approach will continue in the work of the SMI group on elaborating the bridge between H2020 and Horizon / Digital Europe.
MORE Information:
Name:
KDD 2018, London
Authors:
- Todor Ivanov
- Roberto V. Zicari
- Tomás Pariente Lobo
- Nuria de Lama Sanchez
- Arne Berre
- Volker Hoffmann
- Richard Stevens
- Gabriella Cattaneo
- Helena Schwenk
- Cristopher Ostberg-Hansen
- Cristina Pepato
- Barbara Pernici
- Chiara Francalanci
- Angela Geronazzo
- Lucia Polidori
- Paolo Giacomazzi
- Marko Grobelnik
- James Hodson
Abstract:
Organisations rely on evidence from the Benchmarking domain to provide answers on how their processes are performing. There is extensive information on how and why to perform technical benchmarks for the specific management and analytics processes, but there is a lack of objective, evidence-based methods to measure the correlation between Big Data Technology (BDT) benchmarks and an organisation's business benchmarks and demonstrate return on investment (ROI). The DataBench project addresses this significant gap in the current benchmarking community's activities, by providing certifiable benchmarks and evaluation schemes of BDT performance of high business impact and industrial significance.
MORE Information:
Name:
XV itAIS Conference, Pavia
Authors:
- Barbara Pernici
- Chiara Francalanci
- Angela Geronazzo
- Lucia Polidori
- Leonardo Riva
- Stefano Ray
- Arne Berre
- Todor Ivanov
Abstract:
The use of big data in organizations involves numerous decisions on the business and technical side. While the assessment of technical choices has been studied by introducing technical benchmarking approaches, the study of the value of big data and of the impact of business key performance indicators (KPI) on technical choices is still an open problem. The paper discusses a general analysis framework for analyzing big data projects with respect to both technical and business performance indicators, and presents the initial results emerging from a first empirical analysis conducted within European companies and research centers within the European DataBench project and the activities of the benchmarking working group of the Big Data Value Association (BDVA). An analysis method is presented, discussing the impact of confidence and support measurements, and two directions of analysis are studied: the impact of business KPIs on technical parameters and the study of the most important indicators on both the business and the technical side, for specific industry sectors, with the goal of identifying the most relevant design and assessment criteria.
MORE Information:
Name:
2019 BenchCouncil International Symposium on Benchmarking
Authors:
- Todor Ivanov
- Tomás Pariente Lobo
- Arne Berre
- Iván Martínez
- Ricardo Ruiz
- Barbara Pernici
- Chiara Francalanci
Abstract:
In the era of Big Data and AI, it is challenging to know all technical and business advantages of the emerging technologies. The goal of DataBench is to design a benchmarking process helping organizations developing Big Data Technologies (BDT) to reach for excellence and constantly improve their performance, by measuring their technology development activity against parameters of high business relevance. This paper focuses on the internals of the DataBench framework and presents our methodological workflow and framework architecture.
MORE Information:
Name:
3rd Big Data Innovation Conference 2018
Authors:
- Todor Ivanov
Abstract:
Improving Business Performance Through Big Data Benchmarking
MORE Information:
Name:
17th International Conference on Numerical Combustion
Authors:
- Barbara Pernici
Abstract:
DataBench (Evidence Based Big Data Benchmarking to Improve Business Performance) is an H2020 project started in January 2018 with the goal of providing indicators and metrics to evaluate benchmarks available for assessing Big Data technologies, including data analysis systems. Existing benchmarks range from micro-benchmarks, focusing on a specific technological aspect, to end-to-end benchmarks to evaluate business intelligence applications as a whole and identify maximum workloads and possible weaknesses. The talk will present the indicators which have been studied in the project and include Business Features, Big Data Applications Features, Platform and Architecture Features, and Technical Benchmark Features. The results of an initial survey of the use of big data technologies in Europe with approximately 700 respondents will be illustrated and the requirements for different industry sectors analyzed. A specific focus on evaluating big data technologies for scientific data will be presented and discussed, with the aim of gathering further requirements for benchmarking tools for scientific applications, in particular in the combustion domain. Some early results and further requirements for innovative cooperative information systems for storing, curating, managing, and analyzing experiments and models in the combustion domain will also be illustrated. DataBench web site: https://www.databench.eu/
MORE Information:
Name:
DataBench Webinar - BDVe Webinar Series
Authors:
- Tomás Pariente Lobo
Abstract:
DataBench: Benchmarking big data. Introduction to the webinar
Tomás Pariente Lobo, Atos
DataBench Webinar - BDVe Webinar Series
October 9 2018
MORE Information:
Name:
DBTest 2020
Authors:
- Todor Ivanov
Abstract:
Presentation of research article: "CoreBigBench: Benchmarking big data core operations" Authors: Todor Ivanov, Ahmad Ghazal, Alain Crolotte, Pekka Kostamaa and Yoseph Ghazal
Name:
DataBench Webinar - BDVe Webinar Series
Authors:
- Gabriella Cattaneo
- Richard Stevens
Abstract:
Building a bridge between technical and business benchmarking
Gabriella Cattaneo, Richard Stevens, IDC
DataBench Webinar - BDVe Webinar Series
October 9 2018
MORE Information:
Name:
DataBench Webinar - BDVe Webinar Series
Authors:
- Arne Berre
- Tomás Pariente Lobo
- Todor Ivanov
Abstract:
Big data technical benchmarking
Arne J. Berre, SINTEF, Todor Ivanov, Goethe University, Tomás Pariente Lobo, Atos
DataBench Webinar - BDVe Webinar Series
October 9 2018
MORE Information:
Name:
ABench: Big Data Architecture Stack Benchmark
Authors:
- Todor Ivanov
Abstract:
ABench: Big Data Architecture Stack Benchmark
Todor Ivanov, Frankfurt Big Data Lab, Goethe University, Rekha Singhal, TATA Consultancy Services
ICPE 2018, Berlin
April 9-13 2018
MORE Information:
Name:
Exploratory Analysis of Spark Structured Streaming
Authors:
- Todor Ivanov
Abstract:
Exploratory Analysis of Spark Structured Streaming
Todor Ivanov, Jason Taaffe, Frankfurt Big Data Lab, Goethe University
ICPE 2018, Berlin
April 9-13 2018
MORE Information:
Name:
European Big Data Value Forum 2018
Authors:
- Gabriella Cattaneo
- Mike Glennon
- Tomás Pariente Lobo
Abstract:
Building a bridge between technical and business benchmarking
Gabriella Cattaneo, Mike Glennon, IDC, Tomás Pariente Lobo, ATOS
European Big Data Value Forum 2018, Vienna
12 November 2018
MORE Information:
Name:
XV itAIS Conference, Pavia
Authors:
- Barbara Pernici
- Chiara Francalanci
- Angela Geronazzo
- Lucia Polidori
- Stefano Ray
- Leonardo Riva
- Arne Berre
- Todor Ivanov
Abstract:
Relating big data business and technical performance indicators
Barbara Pernici, Chiara Francalanci, Angela Geronazzo, Lucia Polidori, Stefano Ray, Leonardo Riva, Politecnico di Milano, Arne Jørgen Berre, SINTEF, Todor Ivanov, Goethe University
itAIS 2018, Pavia
12 October 2018
MORE Information:
Name:
ICT 2018
Authors:
- Richard Stevens
Abstract:
Impacts of data-driven AI in Business Sectors
Richard Stevens, IDC
ICT 2018, Vienna
5 December 2018
MORE Information:
Name:
OpenExpo Europe 2019
Authors:
- Tomás Pariente Lobo
- Ana María Morales
Abstract:
Tomás Pariente's presentation on DataBench and Benchmarking Big Data received the most votes in the Call4Speakers for the BI and Analytics session. The presentation provided a general overview of the project with a special focus on the development of the Toolbox.
MORE Information:
Name:
AI Governance Forum 2019
Authors:
- Marko Grobelnik
Abstract:
Cross-lingual Real-Time Global Media Monitoring
Name:
Workshop The Second IEEE International Workshop on Benchmarking, Performance Tuning and Optimization for Big Data Applications (BPOD) in 2018 IEEE International Conference on Big Data
Authors:
- Arne Berre
- Gabriella Cattaneo
- Barbara Pernici
- Tomás Pariente Lobo
- Roberto V. Zicari
Abstract:
Benchmarking for Big Data Applications with the DataBench Framework - presented at the Workshop The Second IEEE International Workshop on Benchmarking, Performance Tuning and Optimization for Big Data Applications (BPOD) in 2018 IEEE International Conference on Big Data
Name:
BDVA PPP Newsletter
Authors:
- Ana María Morales
- Stefania Aguzzi
- Cristina Pepato
Abstract:
DataBench press release about the paper "Big Data Key Performance Indicators" for the itAIS Conference was published in the BDVA PPP Newsletter (September) and on the BDVA Website
http://www.big-data-value.eu/databench-paper-big-data-key-performance-indicators-accepted-at-itais-conference/
http://www.bdva.eu/sites/default/files/PDF_Newsletter_SEPT_2018.pdf
Name:
BDVA PPP Newsletter
Authors:
- Cristina Pepato
- Stefania Aguzzi
Abstract:
DataBench press release about the call for industry case studies was published in the BDVA PPP Newsletter (May) and on the BDVA Website
Name:
Atos Research and Innovation Newsletter
Authors:
- Ana María Morales
Abstract:
DataBench on the Atos Research and Innovation Newsletter (Events section), to communicate the participation of DataBench at the event #OpenExpoEurope in Madrid.
Name:
Atos Research and Innovation Newsletter
Authors:
- Ana María Morales
Abstract:
DataBench in the Atos Research and Innovation Newsletter (Projects section), about the General Meeting held by Atos in Madrid
Name:
BDVA PPP Newsletter
Authors:
- Ana María Morales
- Stefania Aguzzi
- Cristina Pepato
Abstract:
DataBench press release about the participation at KDD 2018 was published in the BDVA PPP Newsletter (September) and on the BDVA Website
http://www.big-data-value.eu/the-databench-project-was-presented-at-the-kdd-conference-2018-in-london/
http://www.bdva.eu/sites/default/files/PDF_Newsletter_SEPT_2018.pdf
Name:
BDVA PPP Newsletter
Authors:
- Ana María Morales
- Stefania Aguzzi
- Cristina Pepato
Abstract:
DataBench press release about the performance of the project during 2018 was published in the BDVA PPP Newsletter (December) and on the BDVA Website
http://www.big-data-value.eu/databench-project-finishes-its-first-year-of-activity-with-successful-results/
http://www.bdva.eu/sites/default/files/PDF_Newsletter_DEC_2018.pdf
Name:
BDVA PPP Newsletter
Authors:
- Ana María Morales
Abstract:
DataBench press release about the results of the survey was published in the BDVA PPP Newsletter (February) and on the BDVA Website
http://www.big-data-value.eu/databench-project-released-the-results-of-its-survey-on-700-european-companies-focused-on-their-actual-or-planned-use-of-big-data-and-analytics/
http://www.bdva.eu/sites/default/files/PDF_Newsletter_February_2019_compressed.pdf
Name:
BDVA PPP Newsletter
Authors:
- Ana María Morales
Abstract:
A contribution to the June PPP Newsletter was sent to promote DataBench's new LinkedIn page.
Name:
BDVA PPP Newsletter
Authors:
- Ana María Morales
Abstract:
DataBench press release about the creation of a Benchmarking Community was published in the BDVA PPP Newsletter (January) and on the BDVA Website
http://www.big-data-value.eu/databench-call-for-action-are-you-working-on-projects-related-to-big-data-that-uses-or-develop-benchmarks/?et_fb=1
http://www.bdva.eu/sites/default/files/PDF_Newsletter_January_2019.pdf
Name:
BDVA News Section
Authors:
- Ana María Morales
Abstract:
First webinar scheduled for the BDVe Webinar Series: DataBench Project
Name:
BDVA News Section
Authors:
- Ana María Morales
Abstract:
Results of the webinar held by DataBench with Arne Berre and Gabriella Cattaneo
Name:
Atos Research and Innovation Newsletter
Authors:
- Ana María Morales
Abstract:
DataBench on the Atos Research and Innovation Newsletter (Events section), for the promotion of the BDVe Series webinar with Arne Berre and Gabriella Cattaneo, and the acceptance of the paper for the itAIS Conference
Name:
BDVA PPP Newsletter
Authors:
- Ana María Morales
Abstract:
DataBench press release about the infographic developed with the main highlights of the survey was published in the BDVA PPP Newsletter (March) and on the BDVA Website.
http://www.big-data-value.eu/databench-infographic-based-on-a-survey-on-700-european-companies/
Name:
Atos Research and Innovation Newsletter
Authors:
- Ana María Morales
Abstract:
DataBench on the Atos Research and Innovation Newsletter (Events section), for the promotion of the project's participation at the EBDVF2018
Name:
Atos Research and Innovation Newsletter
Authors:
- Ana María Morales
Abstract:
DataBench on the Atos Research and Innovation Newsletter (Events section), for the promotion of the project's participation at the ICT2018
Name:
Atos Research and Innovation Newsletter
Authors:
- Ana María Morales
Abstract:
DataBench on the Atos Research and Innovation Newsletter (Projects section), for the promotion of the project's new Scientific Publications Section on the website
Name:
DBTest '20: Proceedings of the workshop on Testing Database Systems June 2020 Article No.: 4
Authors:
- Todor Ivanov
Abstract:
Significant effort was put into big data benchmarking with a focus on end-to-end applications. While covering basic functionalities implicitly, the details of the individual contributions to the overall performance are hidden. As a result, end-to-end benchmarks could be biased toward certain basic functions. Micro-benchmarks are more explicit at covering basic functionalities, but they are usually targeted at some highly specialized functions. In this paper we present CoreBigBench, a benchmark that focuses on the most common big data engine/platform functionalities such as scans, two-way joins, common UDF execution and more. These common functionalities are benchmarked over relational and key-value data models, which cover the majority of data models. The benchmark consists of 22 queries applied to sales data and key-value web logs covering the basic functionalities. We ran CoreBigBench on Hive as a proof of concept, verified that the benchmark is easy to deploy, and collected performance data. Finally, we believe that CoreBigBench is a good fit for performance testing of commercial big data engines focused on basic engine functionalities not covered in end-to-end benchmarks.
MORE Information:
Name:
Promotion of DataBench Virtual BenchLearning from @AtosES
Authors:
- Ana María Morales
Abstract:
To update metrics
Name:
Promotion of DataBench presentation at OpenExpoEurope from @AtosES
Authors:
- Ana María Morales
Abstract:
https://twitter.com/AtosES/status/1106126834205421568 | March 14 (Twitter)
https://twitter.com/AtosES/status/1106474123025399808 | March 15 (Twitter)
https://twitter.com/AtosES/status/1103660852987740160 | March 7 (Twitter)
https://twitter.com/AtosES/status/1103932371055722496 | March 8 (Twitter)
https://www.linkedin.com/feed/update/urn:li:activity:6511537689295228928 | March 14 (LinkedIn)
Name:
Promotion of DataBench Virtual BenchLearning from @AtosES
Authors:
- Ana María Morales
Abstract:
66 Likes 8 Media Clicks 26 Shares 25 Detail Expands 2 Comments 100% Organic Engagement
Name:
Promotion of DataBench BDVe Webinar from @AtosES
Authors:
- Ana María Morales
Abstract:
https://bit.ly/2DvwFaD / https://bit.ly/2GEKtCW / https://bit.ly/2RTzl77
Name:
Promotion of DataBench participation at EBDVF2018 from @AtosES
Authors:
- Ana María Morales
Abstract:
https://bit.ly/2N2yjVr / https://bit.ly/2TJXU83 / https://bit.ly/2UQB52I / https://bit.ly/2UWO8zP / https://bit.ly/2N2IrgP / https://bit.ly/2V0Dubp
Name:
Promotion of DataBench participation at ICT2018 from @AtosES
Authors:
- Ana María Morales
Abstract:
https://bit.ly/2SL5qCq / https://bit.ly/2SyFHhu / https://bit.ly/2Bzlldh / https://bit.ly/2N3rFOw
Name:
Promotion of DataBench Video at @AtosES
Authors:
- Ana María Morales
Abstract:
https://twitter.com/AtosES/status/1106109974307049472
https://twitter.com/AtosES/status/1103225970230726657
Name:
ICPE 2020
Authors:
- Todor Ivanov
- Rekha Singhal
Abstract:
The proliferation of big data technology and faster computing systems has made AI-based solutions pervasive in our lives. There is a need to understand how to benchmark systems used to build AI-based solutions that have a complex pipeline of pre-processing, statistical analysis, machine learning and deep learning on data to build prediction models. Solution architects, engineers and researchers may use open-source technology or proprietary systems based on desired performance requirements. The performance metrics may be data pre-processing time, model training time and model inference time. We do not see a single benchmark answering all questions of solution architects and researchers. This tutorial covers both practical and research questions on relevant Big Data and Analytics benchmarks.
MORE Information:
Name:
DataBench Video
Authors:
- Ana María Morales
- Tomás Pariente Lobo
- Nuria de Lama Sanchez
- Cristina Pepato
- Stefania Aguzzi
Abstract:
DataBench Video shows the objectives of the project, the audience targeted, and the expected outcomes. This is the first of a series of videos that will be produced.
MORE Information:
Name:
IDC Survey
Authors:
- Erica Spinoni
Abstract:
This IDC Survey presents the key findings from IDC's European DataBench Survey on the approach to data management and the benefits derived from the deployment of Big Data analytics (BDA) solutions. The survey, conducted in October 2018, included 700 respondents in Europe and covered the most important industry sectors. The companies sampled are based in the most important Western and Eastern European countries.
The survey, which is part of a project funded by the European Commission, was conducted to understand the current European BDA environment and provide European companies with fruitful insights and tools to benchmark themselves on the European market. Visit https://www.databench.eu/ for more information about the project.
"Real-time solutions for Big Data analytics are powerful tools that can enable businesses to improve their management styles, deepen their understanding of daily routines, and achieve better performance and process improvements," said Erica Spinoni, research analyst, IDC Europe.
MORE Information:
Name:
Promotion of Virtual BenchLearning (BDVA PPP Website)
Authors:
- Ana María Morales
Abstract:
Publication of article on News Section
VIRTUAL BENCHLEARNING: ASSESSING THE PERFORMANCE AND IMPACT OF BIG DATA, ANALYTICS AND AI
Name:
IDC Blog Post
Authors:
- Erica Spinoni
Abstract:
The contextual complexity around big data is increasing exponentially. Governance and regulation, data volume and data variety, the velocity of transmission and computation, and the potential for adoption of open source technologies are a few of the many challenges that companies are facing in the Big Data environment. In addition, digital transformation (DX) in Europe is starting to play a significant role as a basis to enhance the European market position and for a future gain in competitiveness.
MORE Information:
Name:
Promotion of Virtual BenchLearning (BDVA PPP Website)
Authors:
- Ana María Morales
Abstract:
Publication of article on News Section DID YOU MISS THE 1ST VIRTUAL BENCHLEARNING BY DATABENCH? CHECK THE RECORDING AND THE PRESENTATIONS NOW!
Name:
Promotion of Virtual BenchLearning (BDVA PPP Website)
Authors:
- Ana María Morales
Abstract:
Publication of article on News Section
DID YOU MISS THE 2ND VIRTUAL BENCHLEARNING BY DATABENCH? CHECK THE RECORDING AND THE PRESENTATIONS NOW!
Name:
Promotion of Virtual BenchLearning (BDVA PPP Website)
Authors:
- Ana María Morales
Abstract:
Publication of article on News Section
SUCCESS STORIES ON BIG DATA & ANALYTICS USE CASES AND DATABENCH TOOLBOX
Name:
IDC Survey
Authors:
- Erica Spinoni
Abstract:
This IDC Survey presents key findings from IDC's European DataBench Survey on the benefits that European companies are achieving or expect to achieve through the deployment of Big Data and analytics (BDA) solutions. Conducted in October 2018, the survey covered 700 respondents in key industry sectors in 11 European countries, grouped by region (Nordics, South Europe, and Central and Eastern Europe). The survey, as part of a European Commission-funded project, was conducted to understand the current European BDA environment and provide organizations with actionable insights and tools to benchmark themselves on the European market. For more information about the project, visit www.databench.eu.
"European companies are achieving important benefits from BDA deployment, particularly in deepening their understanding of customer behavior. Companies must adopt prescriptive and predictive analytics techniques if they want to remain competitive and improve market performance," said Erica Spinoni, research analyst, IDC Europe.
Name:
IDC Survey Spotlight
Authors:
- Erica Spinoni
Abstract:
This IDC Survey Spotlight provides details on the metrics used by European companies to evaluate their Big Data and analytics (BDA) environments and benchmark the impact of BDA solutions already in place. Respondents include European companies with more than 10 employees covering the most important industry sectors.
The analysis is based on data from IDC's EMEA EU DataBench Survey, October 2018.
Name:
IDC Survey Spotlight
Authors:
- Erica Spinoni
Abstract:
This IDC Survey Spotlight provides details on the importance for European companies to evaluate their Big Data and Analytics (BDA) environments with the right set of technical performance metrics. Respondents are European companies with more than 10 employees, covering the most important industry sectors.
The analysis is based on data from IDC's EMEA EU DataBench Survey in October 2018.
Name:
Promotion of Virtual BenchLearning (BDVA PPP Website)
Authors:
- Ana María Morales
Abstract:
Publication of article on News Section BIG DATA – BENCHMARK YOUR WAY TO EXCELLENT BUSINESS PERFORMANCE
Name:
IDC Survey Spotlight
Authors:
- Erica Spinoni
Abstract:
This IDC Survey Spotlight examines how much European organizations have integrated their Big Data analytics (BDA) environments with key Innovation Accelerator technologies. Survey respondents are European companies with more than 10 employees and cover the most important industry sectors.
The analysis is based on data from IDC's European DataBench Survey, October 2018.
MORE Information:
Name:
Virtual BenchLearning Big Data – Benchmark your Way to Excellent Business Performance
Authors:
- Gabriella Cattaneo
- Erica Spinoni
Abstract:
The goal of the DataBench project is to design a benchmarking process that helps European organizations developing Big Data Technologies to reach for excellence and improve their performance, by measuring their technology development activity against parameters of high business relevance. DataBench is investigating existing Big Data benchmarking tools and projects, identifying the main gaps, and developing a robust set of metrics to compare technical results coming from those tools that will be available in the main outcome of the project, the DataBench Toolbox.
In this 45-minute session titled "Big Data – Benchmark your way to excellent business performance", our speakers and representatives of the DataBench Project, Gabriella Cattaneo and Erica Spinoni from IDC, presented the results of the research on Big Data business impacts by European industries. Listen to this webinar to understand how different industries implement Big Data Technologies and benchmarking activities to measure business-related KPIs such as revenue growth, profitability and cost savings, and customer satisfaction, among others, based on a survey conducted by IDC of 700 EU-industry-representative companies.
Name:
Virtual BenchLearning "ASSESSING THE PERFORMANCE AND IMPACT OF BIG DATA, ANALYTICS AND AI"
Authors:
- Arne Berre
- Tomás Pariente Lobo
- Richard Stevens
Abstract:
During this 1-hour session organised in collaboration with BDVA, different speakers from DataBench project provided the audience with a framework and tools to assess the performance and impact of Big Data and AI technologies from the technical perspective, by providing real insights coming from DataBench and other projects active in the benchmarking domain in various industrial sectors.
In addition, representatives from other projects that are part of the BDV PPP, such as DeepHealth and I-BiDaaS, participated to share the challenges and opportunities they have identified in the use of Big Data, Analytics and AI.
The webinar featured:
An introductory session describing the need for DataBench in Big Data and AI centric analytics and its use in comparing the performance indicators that are important to organisations using them.
A demonstration of the DataBench Toolbox and the advantages it brings for organisations that need to assess their data analytics processes.
A discussion with other BDV PPP projects about their needs in terms of benchmarking big data.
Name:
Virtual BenchLearning "Success stories on Big Data & Analytics use cases + DataBench Toolbox"
Authors:
- Chiara Francalanci
- Tomás Pariente Lobo
Abstract:
In this 1-hour session titled "Success stories on Big Data & Analytics Use Cases + DataBench Toolbox", our speakers and participants in the DataBench Project, Chiara Francalanci from Politecnico di Milano and Tomás Pariente from Atos Research and Innovation, presented the most relevant use cases and success stories in the Retail and Manufacturing sectors implementing Big Data, Analytics and Benchmarking tools. The session included a demonstration of the main outcome of the project, the DataBench Toolbox, which allows different types of users, in both technical and business roles, to search, select and deploy Big Data benchmarking tools to generate unified technical metrics and derive business KPIs.
Name:
DataBench Toolbox Architecture
Authors:
- Tomás Pariente Lobo
Abstract:
This white paper reports on the current view of the DataBench Toolbox architecture and main functional elements as described in the DataBench deliverable D3.1. The goal of the DataBench Toolbox is to provide a way of reusing existing big data benchmarking efforts under a common framework, thereby providing a way to select, download and homogenize technical and business indicators.
Name:
DBTest 2018
Authors:
- Todor Ivanov
- Roberto V. Zicari
Abstract:
BigBench standardized as TPCx-BB is a popular application benchmark that targets Big Data storage and processing systems. BigBench V2 addresses some of the BigBench limitations by introducing a new simplified data model, semi-structured web logs in JSON file format and new queries mandating late binding. However, it still covers only batch processing workloads and the Big Data velocity characteristic is not addressed. This work extends the BigBench V2 benchmark with a data streaming component that simulates typical statistical and predictive analytics queries in a retail business scenario. Our approach is to preserve the existing BigBench design and introduce a new streaming component that supports two data streaming modes: active and passive. In active mode, the data stream generation and processing happen in parallel, whereas in passive mode, the data stream is pre-generated in advance before the actual stream processing. The stream workload consists of five queries inspired by the existing 30 BigBench queries. To validate the proposed streaming extension, the two streaming modes were implemented and tested using Kafka and Spark Streaming. The experimental results prove the feasibility of our benchmark design. Finally, we outline design challenges and future plans for improving the proposed BigBench extension.
test2:
Abstract:
The use of big data in organizations involves numerous decisions on the business and technical side. While the assessment of technical choices has been studied by introducing technical benchmarking approaches, the study of the value of big data and of the impact of business key performance indicators (KPI) on technical choices is still an open problem. The paper discusses a general analysis framework for analyzing big data projects with respect to both technical and business performance indicators, and presents the initial results emerging from a first empirical analysis conducted within European companies and research centers within the European DataBench project and the activities of the benchmarking working group of the Big Data Value Association (BDVA). An analysis method is presented, discussing the impact of confidence and support measurements, and two directions of analysis are studied: the impact of business KPIs on technical parameters and the study of the most important indicators both on the business and on the technical side, for specific industry sectors, with the goal of identifying the most relevant design and assessment criteria.
MORE Information:
Abstract:
Organisations rely on evidence from the Benchmarking domain to provide answers on how their processes are performing. There is extensive information on how and why to perform technical benchmarks for the specific management and analytics processes, but there is a lack of objective, evidence-based methods to measure the correlation between Big Data Technology (BDT) benchmarks and an organisation's business benchmarks and demonstrate return on investment (ROI). The DataBench project addresses this significant gap in the current benchmarking community's activities, by providing certifiable benchmarks and evaluation schemes of BDT performance of high business impact and industrial significance.
MORE Information:
Abstract:
NA
Abstract:
NA
Abstract:
NA
Abstract:
This white paper reports on the current view of the DataBench Toolbox architecture and main functional elements as described in the DataBench deliverable D3.1. The goal of the DataBench Toolbox is to provide a way of reusing existing big data benchmarking efforts under a common framework, thereby providing a way to select, download and homogenize technical and business indicators.
Abstract:
Relating big data business and technical performance indicators
Barbara Pernici, Chiara Francalanci, Angela Geronazzo, Lucia Polidori, Stefano Ray, Leonardo Riva, Politecnico di Milano, Arne Jørgen Berre, SINTEF, Todor Ivanov, Goethe University
itAIS 2018, Pavia
12 October 2018
MORE Information:
Abstract:
https://bit.ly/2DvwFaD / https://bit.ly/2GEKtCW / https://bit.ly/2RTzl77
Abstract:
https://bit.ly/2N2yjVr / https://bit.ly/2TJXU83 / https://bit.ly/2UQB52I / https://bit.ly/2UWO8zP / https://bit.ly/2N2IrgP / https://bit.ly/2V0Dubp
Abstract:
https://bit.ly/2SL5qCq / https://bit.ly/2SyFHhu / https://bit.ly/2Bzlldh / https://bit.ly/2N3rFOw
Abstract:
DataBench press release about the participation at KDD 2018 was published in the BDVA PPP Newsletter (September) and on the BDVA Website
http://www.big-data-value.eu/the-databench-project-was-presented-at-the-kdd-conference-2018-in-london/
http://www.bdva.eu/sites/default/files/PDF_Newsletter_SEPT_2018.pdf
Abstract:
DataBench press release about the paper "Big Data Key Performance Indicators" for the itAIS Conference was published in the BDVA PPP Newsletter (September) and on the BDVA Website
http://www.big-data-value.eu/databench-paper-big-data-key-performance-indicators-accepted-at-itais-conference/
http://www.bdva.eu/sites/default/files/PDF_Newsletter_SEPT_2018.pdf
Abstract:
DataBench press release about the performance of the project during 2018 was published in the BDVA PPP Newsletter (December) and on the BDVA Website
http://www.big-data-value.eu/databench-project-finishes-its-first-year-of-activity-with-successful-results/
http://www.bdva.eu/sites/default/files/PDF_Newsletter_DEC_2018.pdf
Abstract:
DataBench press release about the creation of a Benchmarking Community was published in the BDVA PPP Newsletter (January) and on the BDVA Website
http://www.big-data-value.eu/databench-call-for-action-are-you-working-on-projects-related-to-big-data-that-uses-or-develop-benchmarks/?et_fb=1
http://www.bdva.eu/sites/default/files/PDF_Newsletter_January_2019.pdf
Abstract:
First webinar scheduled for the BDVe Webinar Series: DataBench Project
Abstract:
Results of the webinar held by DataBench with Arne Berre and Gabriella Cattaneo
Abstract:
DataBench on the Atos Research and Innovation Newsletter (Events section), for the promotion of the BDVe Series webinar with Arne Berre and Gabriella Cattaneo, and the acceptance of the paper for the itAIS Conference
Abstract:
DataBench on the Atos Research and Innovation Newsletter (Events section), for the promotion of the project's participation at the EBDVF2018
Abstract:
DataBench on the Atos Research and Innovation Newsletter (Events section), for the promotion of the project's participation at the ICT2018
Abstract:
DataBench on the Atos Research and Innovation Newsletter (Projects section), for the promotion of the project's new Scientific Publications Section on the website
Abstract:
Flyer to download the content of the WhitePaper: DataBench Toolbox
Abstract:
Benchmarking for Big Data Applications with the DataBench Framework - presented at the Second IEEE International Workshop on Benchmarking, Performance Tuning and Optimization for Big Data Applications (BPOD), held at the 2018 IEEE International Conference on Big Data
Abstract:
Presence at the booth and distribution of marketing materials, audience engagement, DataBench video display
Abstract:
This IDC Survey Spotlight provides details on the metrics used by European companies to evaluate their Big Data and analytics (BDA) environments and benchmark the impact of BDA solutions already in place. Respondents include European companies with more than 10 employees covering the most important industry sectors.
The analysis is based on data from IDC's EMEA EU DataBench Survey, October 2018.
Abstract:
This IDC Survey Spotlight provides details on the importance for European companies to evaluate their Big Data and Analytics (BDA) environments with the right set of technical performance metrics. Respondents are European companies with more than 10 employees, covering the most important industry sectors.
The analysis is based on data from IDC's EMEA EU DataBench Survey in October 2018.
Abstract:
This IDC Survey presents key findings from IDC's European DataBench Survey on the benefits that European companies are achieving or expect to achieve through the deployment of Big Data and analytics (BDA) solutions. Conducted in October 2018, the survey covered 700 respondents in key industry sectors in 11 European countries, grouped by region (Nordics, South Europe, and Central and Eastern Europe). The survey, as part of a European Commission-funded project, was conducted to understand the current European BDA environment and provide organizations with actionable insights and tools to benchmark themselves on the European market. For more information about the project, visit www.databench.eu.
"European companies are achieving important benefits from BDA deployment, particularly in deepening their understanding of customer behavior. Companies must adopt prescriptive and predictive analytics techniques if they want to remain competitive and improve market performance," said Erica Spinoni, research analyst, IDC Europe.
Abstract:
This IDC Survey Spotlight examines how much European organizations have integrated their Big Data analytics (BDA) environments with key Innovation Accelerator technologies. Survey respondents are European companies with more than 10 employees and cover the most important industry sectors.
The analysis is based on data from IDC's European DataBench Survey, October 2018.
MORE Information:
Abstract:
This IDC Survey presents the key findings from IDC's European DataBench Survey on the approach to data management and the benefits derived from the deployment of Big Data analytics (BDA) solutions. The survey, conducted in October 2018, included 700 respondents in Europe and covered the most important industry sectors. The companies sampled are based in the most important Western and Eastern European countries.
The survey, which is part of a project funded by the European Commission, was conducted to understand the current European BDA environment and provide European companies with fruitful insights and tools to benchmark themselves on the European market. Visit https://www.databench.eu/ for more information about the project.
"Real-time solutions for Big Data analytics are powerful tools that can enable businesses to improve their management styles, deepen their understanding of daily routines, and achieve better performance and process improvements," said Erica Spinoni, research analyst, IDC Europe.
MORE Information:
Abstract:
DataBench press release about the results of the survey was published in the BDVA PPP Newsletter (February) and on the BDVA Website
http://www.big-data-value.eu/databench-project-released-the-results-of-its-survey-on-700-european-companies-focused-on-their-actual-or-planned-use-of-big-data-and-analytics/
http://www.bdva.eu/sites/default/files/PDF_Newsletter_February_2019_compressed.pdf
Abstract:
The DataBench project has developed the first of a series of infographics presenting the results of a survey about Big Data and Analytics (BDA) of 700 European companies.
The survey, carried out by DataBench between September and October 2018, investigated the actual or planned use of BDA by 700 European businesses in 11 EU Member States and identified the relevance of business KPIs for BDA users. The detailed analysis is presented in report D2.2 of the DataBench Project.
The infographic presents the main highlights of the survey concerning:
Level of adoption of BDA solutions by EU businesses from different Industries and Business Areas
Business goals driving the adoption of this type of technologies
Most relevant KPI categories for measuring the business impact
Achievements and expected benefits for using BDA solutions
Current use of analytic techniques
Current level of Big Data skills gap
https://www.databench.eu/wp-content/uploads/2019/03/databench_infographic1.pdf
https://www.databench.eu/big-data-analytics-big-opportunities-for-eu-companies-have-a-look-at-the-first-databench-infographic/
Abstract:
DataBench in the Atos Research and Innovation Newsletter (Projects section), about the General Meeting held by Atos in Madrid
Abstract:
DataBench on the Atos Research and Innovation Newsletter (Events section), to communicate the participation of DataBench at the event #OpenExpoEurope in Madrid.
Abstract:
https://twitter.com/AtosES/status/1106126834205421568 | March 14 (Twitter)
https://twitter.com/AtosES/status/1106474123025399808 | March 15 (Twitter)
https://twitter.com/AtosES/status/1103660852987740160 | March 7 (Twitter)
https://twitter.com/AtosES/status/1103932371055722496 | March 8 (Twitter)
https://www.linkedin.com/feed/update/urn:li:activity:6511537689295228928 | March 14 (LinkedIn)
Abstract:
https://twitter.com/AtosES/status/1106109974307049472
https://twitter.com/AtosES/status/1103225970230726657
Abstract:
DataBench press release about the infographic developed with the main highlights of the survey was published at the BDVA PPP Newsletter (March) and BDVA Website.
http://www.big-data-value.eu/databench-infographic-based-on-a-survey-on-700-european-companies/
Abstract:
BigBench standardized as TPCx-BB is a popular application benchmark that targets Big Data storage and processing systems. BigBench V2 addresses some of the BigBench limitations by introducing a new simplified data model, semi-structured web logs in JSON file format and new queries mandating late binding. However, it still covers only batch processing workloads and the Big Data velocity characteristic is not addressed. This work extends the BigBench V2 benchmark with a data streaming component that simulates typical statistical and predictive analytics queries in a retail business scenario. Our approach is to preserve the existing BigBench design and introduce a new streaming component that supports two data streaming modes: active and passive. In active mode, the data stream generation and processing happen in parallel, whereas in passive mode, the data stream is pre-generated in advance before the actual stream processing. The stream workload consists of five queries inspired by the existing 30 BigBench queries. To validate the proposed streaming extension, the two streaming modes were implemented and tested using Kafka and Spark Streaming. The experimental results prove the feasibility of our benchmark design. Finally, we outline design challenges and future plans for improving the proposed BigBench extension.
Abstract:
DataBench press release about the call for industry case studies was published in the BDVA PPP Newsletter (May) and on the BDVA Website
Abstract:
DataBench (Evidence Based Big Data Benchmarking to Improve Business Performance) is an H2020 project that started in January 2018 with the goal of providing indicators and metrics to evaluate benchmarks available for assessing Big Data technologies, including data analysis systems. Existing benchmarks range from micro-benchmarks, focusing on a specific technological aspect, to end-to-end benchmarks that evaluate business intelligence applications as a whole and identify maximum workloads and possible weaknesses. The talk will present the indicators which have been studied in the project and include Business features, Big Data Applications Features, Platform and Architecture Features, and Technical Benchmark Features. The results of an initial survey of the use of big data technologies in Europe with approximately 700 respondents will be illustrated and the requirements for different industry sectors analyzed. A specific focus on evaluating big data technologies for scientific data will be presented and discussed, with the aim of gathering further requirements for benchmarking tools for scientific applications, and in particular in the combustion domain. Some early results and further requirements for innovative cooperative information systems for storing, curating, managing, and analyzing experiments and models in the combustion domain will also be illustrated. DataBench web site: https://www.databench.eu/
MORE Information:
Abstract:
Moderation of KDD 2018 Project Showcase
Abstract:
n/a
Abstract:
Cross-lingual Real-Time Global Media Monitoring
Abstract:
At times our society feels like a runaway train. Technologies in artificial intelligence, biotech, nanotech, to name only a few fields, are developing at an extreme pace, but are not accompanied by a strategic analysis of their impact not only on our daily lives but the whole of humanity – on social relations, on our emotional and biological selves, as well as on our legal systems and regulatory frameworks. The infrastructure for such fundamental changes is not in place.
The panel will explore cutting edge technological advances and how they will affect our lives and the human race as a whole. What changes can we foresee in the coming decades? How can we ensure that they will lead us to a better existence and that we use technology to improve our lives and not to perpetuate the cycles of global violence and wars that mark human history?
Abstract:
Can data science help reduce police violence and misconduct? Can it help increase retention of patients in care? Can it help prevent children from getting lead poisoning? Can it help cities better target limited resources to improve the lives of citizens? We're all aware of the hype around data science and related buzzwords right now, but turning this hype into social impact takes cross-disciplinary training, teams, and methods.
Abstract:
Benchmarking for Digital Platforms with Big Data, IoT, AI, Cloud, HPC and CyberSecurity is being introduced based on European activities in this area, in particular related to the DataBench project and work in BDVA, AIOTI and ECSO.
Abstract:
Benchmarking session at EBDVF'2018
Introduction - Arne Berre/Axel Ngonga
Designing Big Data Benchmarks - Irini Fundulaki
LDBC - Peter Boncz
DataBench - Gabriella Cattaneo/Tomas P. Lobo
Holistic Benchmarking - Axel Ngonga/Gayane Sedrakyan
Benchmarking as a service - Pavel Smirnov (AGT)
The EU Big Data Inducement Prize challenge - Kimmo Rossi (EC)
Presentation of solutions of winners of the EU Big Data Inducement Prize
Summary and Discussion
Abstract:
A contribution for the June PPP Newsletter was sent to promote the new DataBench page on LinkedIn.
Abstract:
The contextual complexity around big data is increasing exponentially. Governance and regulation, data volume and data variety, the velocity of transmission and computation, and the potential for adoption of open source technologies are a few of the many challenges that companies are facing in the Big Data environment. In addition, digital transformation (DX) in Europe is starting to play a significant role as a basis to enhance the European market position and for a future gain in competitiveness.
MORE Information:
Abstract:
Distributed big data processing and analytics applications demand a comprehensive end-to-end architecture stack consisting of big data technologies. However, there are many possible architecture patterns (e.g. Lambda, Kappa or Pipeline architectures) to choose from when implementing the application requirements. A big data technology in isolation may be best performing for a particular application, but its performance in connection with other technologies depends on the connectors and the environment. Similarly, existing big data benchmarks evaluate the performance of different technologies in isolation, but no work has been done on benchmarking big data architecture stacks as a whole. For example, BigBench (TPCx-BB) may be used to evaluate the performance of Spark, but is it applicable to PySpark or to Spark with Kafka stack as well? What is the impact of having different programming environments and/or any other technology like Spark? This vision paper proposes a new category of benchmark, called ABench, to fill this gap and discusses key aspects necessary for the performance evaluation of different big data architecture stacks.
MORE Information:
Abstract:
In the Big Data era, stream processing has become a common requirement for many data-intensive applications. This has led to many advances in the development and adaption of large scale streaming systems. Spark and Flink have become popular choices for many developers as they combine both batch and streaming capabilities in a single system. However, the introduction of Spark Structured Streaming in version 2.0 opened up completely new features for SparkSQL, which are otherwise only available in Apache Calcite. This work focuses on the new Spark Structured Streaming and analyses it by diving into its internal functionalities. With the help of a micro-benchmark consisting of streaming queries, we perform initial experiments evaluating the technology. Our results show that Spark Structured Streaming is able to run multiple queries successfully in parallel on data with changing velocity and volume sizes.
MORE Information:
Abstract:
BigBench standardized as TPCx-BB is a popular application benchmark that targets Big Data storage and processing systems. BigBench V2 addresses some of the BigBench limitations by introducing a new simplified data model, semi-structured web logs in JSON file format and new queries mandating late binding. However, it still covers only batch processing workloads and the Big Data velocity characteristic is not addressed. This work extends the BigBench V2 benchmark with a data streaming component that simulates typical statistical and predictive analytics queries in a retail business scenario. Our approach is to preserve the existing BigBench design and introduce a new streaming component that supports two data streaming modes: active and passive. In active mode, the data stream generation and processing happen in parallel, whereas in passive mode, the data stream is pre-generated in advance before the actual stream processing. The stream workload consists of five queries inspired by the existing 30 BigBench queries. To validate the proposed streaming extension, the two streaming modes were implemented and tested using Kafka and Spark Streaming. The experimental results prove the feasibility of our benchmark design. Finally, we outline design challenges and future plans for improving the proposed BigBench extension.
MORE Information:
Abstract:
This chapter reviews the evolution of analytics benchmarks and their current state (as of 2017). It starts with an overview of the most relevant benchmarking organizations and their benchmark standards, and outlines the latest benchmark developments and initiatives targeting the emerging Big Data Analytics systems. Last but not least, the typical benchmark components are described, as well as the different goals that these benchmarks try to achieve.
MORE Information:
Abstract:
In the era of Big Data and AI, it is challenging to know all technical and business advantages of the emerging technologies. The goal of DataBench is to design a benchmarking process helping organizations developing Big Data Technologies (BDT) to reach for excellence and constantly improve their performance, by measuring their technology development activity against parameters of high business relevance. This paper focuses on the internals of the DataBench framework and presents our methodological workflow and framework architecture.
MORE Information:
Abstract:
DataBench promotion @IDC booth
Abstract:
The proliferation of big data technology and faster computing systems has led to the pervasiveness of AI-based solutions in our lives. There is a need to understand how to benchmark the systems used to build AI-based solutions, which have a complex pipeline of pre-processing, statistical analysis, machine learning and deep learning on data to build prediction models. Solution architects, engineers and researchers may use open-source technology or proprietary systems depending on the desired performance requirements. The performance metrics may be data pre-processing time, model training time and model inference time. We do not see a single benchmark answering all questions of solution architects and researchers. This tutorial covers both practical and research questions on relevant Big Data and Analytics benchmarks.
MORE Information:
Abstract:
The goal of the DataBench project is to design a benchmarking process helping European organizations developing Big Data Technologies to reach for excellence and improve their performance, by measuring their technology development activity against parameters of high business relevance. DataBench is investigating existing Big Data benchmarking tools and projects, identifying the main gaps and developing a robust set of metrics to compare technical results coming from those tools, which will be available in the main outcome of the project, the DataBench Toolbox.
In this 45-minute session, titled "Big Data – Benchmark your way to excellent business performance", our speakers and representatives of the DataBench Project, Gabriella Cattaneo and Erica Spinoni from IDC, presented the results of the research on Big Data business impacts by European industries. Listen to this webinar to understand how different industries implement Big Data Technologies and benchmark activities for measuring business-related KPIs such as revenue growth, profitability and cost savings, and customer satisfaction, among others, based on a survey conducted by IDC of 700 EU-industry-representative companies.
Abstract:
66 Likes 8 Media Clicks 26 Shares 25 Detail Expands 2 Comments 100% Organic Engagement
Abstract:
Promotion of Virtual BenchLearning (BDVA PPP Mailing List)
Abstract:
Publication of article on News Section BIG DATA – BENCHMARK YOUR WAY TO EXCELLENT BUSINESS PERFORMANCE
Abstract:
Promotion of Virtual BenchLearning (IDC Mailing List)
Abstract:
Promotion of DataBench Virtual BenchLearning - Atos Internal Newsletter Iberia (1100 contacts), Atos Research and Innovation Internal Newsletter (180 contacts)
Abstract:
In this 1-hour session, titled "Success stories on Big Data & Analytics Use Cases + DataBench Toolbox", our speakers and participants of the DataBench Project, Chiara Francalanci from Politecnico di Milano and Tomás Pariente from Atos Research and Innovation, presented the most relevant use cases and success stories in the Retail and Manufacturing sectors implementing Big Data, Analytics and Benchmarking tools. The session included a demonstration of the main outcome of the project, the DataBench Toolbox, which allows different types of users, from both technical and business roles, to search, select and deploy Big Data benchmarking tools to generate unified technical metrics and derive business KPIs.
Abstract:
To update metrics
Abstract:
Publication of article on News Section DID YOU MISS THE 1ST VIRTUAL BENCHLEARNING BY DATABENCH? CHECK THE RECORDING AND THE PRESENTATIONS NOW!
Abstract:
Publication of article on News Section
SUCCESS STORIES ON BIG DATA & ANALYTICS USE CASES AND DATABENCH TOOLBOX
Abstract:
Publication of article on News Section
DID YOU MISS THE 2ND VIRTUAL BENCHLEARNING BY DATABENCH? CHECK THE RECORDING AND THE PRESENTATIONS NOW!
Abstract:
Promotion of Virtual BenchLearning (BDVA PPP Mailing List)
Abstract:
Promotion of 2nd DataBench Virtual BenchLearning - Atos Internal Newsletter Iberia (1100 contacts), Atos Research and Innovation Internal Newsletter (180 contacts)
Abstract:
During this 1-hour session organised in collaboration with BDVA, different speakers from DataBench project provided the audience with a framework and tools to assess the performance and impact of Big Data and AI technologies from the technical perspective, by providing real insights coming from DataBench and other projects active in the benchmarking domain in various industrial sectors.
In addition, representatives from other projects part of the BDV PPP such as DeepHealth and I-BiDaaS participated to share the challenges and opportunities they have identified on the use of Big Data, Analytics, AI.
The webinar featured:
An introductory session describing the need for DataBench in Big Data and AI centric analytics and its use in comparing the performance indicators that are important to organisations using them.
A demonstration of the DataBench Toolbox and the advantages it brings for organisations that need to assess their data analytics processes.
A discussion with other BDV PPP projects about their needs in terms of benchmarking big data.
Abstract:
Publication of article on News Section
VIRTUAL BENCHLEARNING: ASSESSING THE PERFORMANCE AND IMPACT OF BIG DATA, ANALYTICS AND AI
Abstract:
Promotion of 3rd Virtual BenchLearning (IDC Mailing List)
Abstract:
Promotion of 3rd Virtual BenchLearning (BDVA PPP Mailing List)
Abstract:
Presentation of research article: "CoreBigBench: Benchmarking big data core operations" Authors: Todor Ivanov, Ahmad Ghazal, Alain Crolotte, Pekka Kostamaa and Yoseph Ghazal
Abstract:
Significant effort was put into big data benchmarking with focus on end-to-end applications. While covering basic functionalities implicitly, the details of the individual contributions to the overall performance are hidden. As a result, end-to-end benchmarks could be biased toward certain basic functions. Micro-benchmarks are more explicit at covering basic functionalities but they are usually targeted at some highly specialized functions. In this paper we present CoreBigBench, a benchmark that focuses on the most common functionalities of big data engines/platforms, like scans, two-way joins, common UDF execution and more. These common functionalities are benchmarked over relational and key-value data models, which cover the majority of data models. The benchmark consists of 22 queries applied to sales data and key-value web logs covering the basic functionalities. We ran CoreBigBench on Hive as a proof of concept, verified that the benchmark is easy to deploy, and collected performance data. Finally, we believe that CoreBigBench is a good fit for performance testing of commercial big data engines, focused on basic engine functionalities not covered in end-to-end benchmarks.
MORE Information:
Abstract:
Significant effort was put into big data benchmarking with focus on end-to-end applications. While covering basic functionalities implicitly, the details of the individual contributions to the overall performance are hidden. As a result, end-to-end benchmarks could be biased toward certain basic functions. Micro-benchmarks are more explicit at covering basic functionalities but they are usually targeted at some highly specialized functions. In this paper we present CoreBigBench, a benchmark that focuses on the most common functionalities of big data engines/platforms, like scans, two-way joins, common UDF execution and more. These common functionalities are benchmarked over relational and key-value data models, which cover the majority of data models. The benchmark consists of 22 queries applied to sales data and key-value web logs covering the basic functionalities. We ran CoreBigBench on Hive as a proof of concept, verified that the benchmark is easy to deploy, and collected performance data. Finally, we believe that CoreBigBench is a good fit for performance testing of commercial big data engines, focused on basic engine functionalities not covered in end-to-end benchmarks.
MORE Information:
Abstract:
This is the second edition of this paper about Big Data challenges in the Smart Manufacturing Industry. In the 2018 edition, the focus was on the alignment between the BDVA Reference Model and the EFFRA Strategic Research and Innovation Agenda through analysis of their respective reference architectures and alignment of their respective technical challenges. The main outcome of the 2018 activities was the identification of 56 Big Data technical challenges in the three Manufacturing Grand Scenarios of Smart Factory, Smart Product and Smart Supply Chain. Now, in the 2020 edition, a twofold approach was chosen: on the one side (continuity evolutionary approach), we are updating the previous content, not just of the technology but also of the legal-sociobusiness landscape. On the other side, H2020 is in its last period and we need to think of more disruptive and revolutionary topics to be implemented along the future Multiannual Financial Framework (MFF 2021-2027) and respectively in both the Digital Europe and Horizon Europe programs. This second approach will continue in the work of the SMI group on elaborating the bridge between H2020 and Horizon / Digital Europe.
MORE Information:
Abstract:
In the era of Big Data and AI, it is challenging to know all technical and business advantages of the emerging technologies. The goal of DataBench is to design a benchmarking process helping organizations developing Big Data Technologies (BDT) to reach for excellence and constantly improve their performance, by measuring their technology development activity against parameters of high business relevance. This paper focuses on the internals of the DataBench framework and presents our methodological workflow and framework architecture.
MORE Information:
Publications
Abstract:
The use of big data in organizations involves numerous decisions on the business and technical side. While the assessment of technical choices has been studied by introducing technical benchmarking approaches, the study of the value of big data and of the impact of business key performance indicators (KPIs) on technical choices is still an open problem. The paper discusses a general framework for analyzing big data projects with respect to both technical and business performance indicators, and presents the initial results emerging from a first empirical analysis conducted within European companies and research centers within the European DataBench project and the activities of the benchmarking working group of the Big Data Value Association (BDVA). An analysis method is presented, discussing the impact of confidence and support measurements, and two directions of analysis are studied: the impact of business KPIs on technical parameters, and the study of the most important indicators on both the business and the technical side for specific industry sectors, with the goal of identifying the most relevant design and assessment criteria.
MORE Information:
Abstract:
Organisations rely on evidence from the Benchmarking domain to provide answers on how their processes are performing. There is extensive information on how and why to perform technical benchmarks for the specific management and analytics processes, but there is a lack of objective, evidence-based methods to measure the correlation between Big Data Technology (BDT) benchmarks and an organisation's business benchmarks and demonstrate return on investment (ROI). The DataBench project addresses this significant gap in the current benchmarking community's activities, by providing certifiable benchmarks and evaluation schemes of BDT performance of high business impact and industrial significance.
MORE Information:
Abstract:
NA
Abstract:
NA
Abstract:
NA
Abstract:
This white paper reports on the current view of the DataBench Toolbox architecture and main functional elements as described in the DataBench deliverable D3.1. The goal of the DataBench Toolbox is to provide a way of reusing existing big data benchmarking efforts under a common framework, providing therefore a way to select, download and homogenize technical and business indicators.
Abstract:
Relating big data business and technical performance indicators
Barbara Pernici, Chiara Francalanci, Angela Geronazzo, Lucia Polidori, Stefano Ray, Leonardo Riva, Politecnico di Milano, Arne Jørgen Berre, SINTEF, Todor Ivanov, Goethe University
itAIS 2018, Pavia
12 October 2018
MORE Information:
Abstract:
https://bit.ly/2DvwFaD / https://bit.ly/2GEKtCW / https://bit.ly/2RTzl77
Abstract:
https://bit.ly/2N2yjVr / https://bit.ly/2TJXU83 / https://bit.ly/2UQB52I / https://bit.ly/2UWO8zP / https://bit.ly/2N2IrgP / https://bit.ly/2V0Dubp
Abstract:
https://bit.ly/2SL5qCq / https://bit.ly/2SyFHhu / https://bit.ly/2Bzlldh / https://bit.ly/2N3rFOw
Abstract:
DataBench press release about the participation at KDD 2018 was published in the BDVA PPP Newsletter (September) and on the BDVA Website
http://www.big-data-value.eu/the-databench-project-was-presented-at-the-kdd-conference-2018-in-london/
http://www.bdva.eu/sites/default/files/PDF_Newsletter_SEPT_2018.pdf
Abstract:
DataBench press release about the paper "Big Data Key Performance Indicators" for the itAIS Conference was published in the BDVA PPP Newsletter (September) and on the BDVA Website
http://www.big-data-value.eu/databench-paper-big-data-key-performance-indicators-accepted-at-itais-conference/
http://www.bdva.eu/sites/default/files/PDF_Newsletter_SEPT_2018.pdf
Abstract:
DataBench press release about the performance of the project during 2018 was published in the BDVA PPP Newsletter (December) and on the BDVA Website
http://www.big-data-value.eu/databench-project-finishes-its-first-year-of-activity-with-successful-results/
http://www.bdva.eu/sites/default/files/PDF_Newsletter_DEC_2018.pdf
Abstract:
DataBench press release about the creation of a Benchmarking Community was published in the BDVA PPP Newsletter (January) and on the BDVA Website
http://www.big-data-value.eu/databench-call-for-action-are-you-working-on-projects-related-to-big-data-that-uses-or-develop-benchmarks/?et_fb=1
http://www.bdva.eu/sites/default/files/PDF_Newsletter_January_2019.pdf
Abstract:
First webinar scheduled for the BDVe Webinar Series: DataBench Project
Abstract:
Results of the webinar held by DataBench with Arne Berre and Gabriella Cattaneo
Abstract:
DataBench on the Atos Research and Innovation Newsletter (Events section), for the promotion of the BDVe Series webinar with Arne Berre and Gabriella Cattaneo, and the acceptance of the paper for the itAIS Conference
Abstract:
DataBench on the Atos Research and Innovation Newsletter (Events section), for the promotion of the project's participation at the EBDVF2018
Abstract:
DataBench on the Atos Research and Innovation Newsletter (Events section), for the promotion of the project's participation at the ICT2018
Abstract:
DataBench on the Atos Research and Innovation Newsletter (Projects section), for the promotion of the project's new Scientific Publications Section on the website
Abstract:
Flyer to download the content of the WhitePaper: DataBench Toolbox
Abstract:
Benchmarking for Big Data Applications with the DataBench Framework - presented at the Second IEEE International Workshop on Benchmarking, Performance Tuning and Optimization for Big Data Applications (BPOD), held at the 2018 IEEE International Conference on Big Data
Abstract:
Presence at the booth and distribution of marketing materials, audience engagement, DataBench video display
Abstract:
This IDC Survey Spotlight provides details on the metrics used by European companies to evaluate their Big Data and analytics (BDA) environments and benchmark the impact of BDA solutions already in place. Respondents include European companies with more than 10 employees covering the most important industry sectors.
The analysis is based on data from IDC's EMEA EU DataBench Survey, October 2018.
Abstract:
This IDC Survey Spotlight provides details on the importance for European companies to evaluate their Big Data and Analytics (BDA) environments with the right set of technical performance metrics. Respondents are European companies with more than 10 employees, covering the most important industry sectors.
The analysis is based on data from IDC's EMEA EU DataBench Survey in October 2018.
Abstract:
This IDC Survey presents key findings from IDC's European DataBench Survey on the benefits that European companies are achieving or expect to achieve through the deployment of Big Data and analytics (BDA) solutions. Conducted in October 2018, the survey covered 700 respondents in key industry sectors in 11 European countries, grouped by region (Nordics, South Europe, and Central and Eastern Europe). The survey, as part of a European Commission-funded project, was conducted to understand the current European BDA environment and provide organizations with actionable insights and tools to benchmark themselves on the European market. For more information about the project, visit www.databench.eu.
"European companies are achieving important benefits from BDA deployment, particularly in deepening their understanding of customer behavior. Companies must adopt prescriptive and predictive analytics techniques if they want to remain competitive and improve market performance," said Erica Spinoni, research analyst, IDC Europe.
Abstract:
This IDC Survey Spotlight examines how much European organizations have integrated their Big Data analytics (BDA) environments with key Innovation Accelerator technologies. Survey respondents are European companies with more than 10 employees and cover the most important industry sectors.
The analysis is based on data from IDC's European DataBench Survey, October 2018.
MORE Information:
Abstract:
This IDC Survey presents the key findings from IDC's European DataBench Survey on the approach to data management and the benefits derived from the deployment of Big Data analytics (BDA) solutions. The survey, conducted in October 2018, included 700 respondents in Europe and covered the most important industry sectors. The companies sampled are based in the most important Western and Eastern European countries.
The survey, which is part of a project funded by the European Commission, was conducted to understand the current European BDA environment and provide European companies with fruitful insights and tools to benchmark themselves on the European market. Visit https://www.databench.eu/ for more information about the project.
"Real-time solutions for Big Data analytics are powerful tools that can enable businesses to improve their management styles, deepen their understanding of daily routines, and achieve better performance and process improvements," said Erica Spinoni, research analyst, IDC Europe.
MORE Information:
Abstract:
DataBench press release about the results of the survey was published in the BDVA PPP Newsletter (February) and on the BDVA Website
http://www.big-data-value.eu/databench-project-released-the-results-of-its-survey-on-700-european-companies-focused-on-their-actual-or-planned-use-of-big-data-and-analytics/
http://www.bdva.eu/sites/default/files/PDF_Newsletter_February_2019_compressed.pdf
Abstract:
DataBench project has developed the first of a series of infographics presenting the results of a survey about Big Data and Analytics (BDA) on 700 European companies.
The survey carried out by DataBench between September and October 2018 investigated the actual or planned use of BDA by 700 European businesses in 11 EU Member States and identified the relevance of business KPIs for BDA users. The detailed analysis is presented in the report D2.2 of DataBench Project.
The infographic presents the main highlights of the survey concerning:
Level of adoption of BDA solutions by EU businesses from different Industries and Business Areas
Business goals driving the adoption of this type of technologies
Most relevant KPI categories for measuring the business impact
Achievements and expected benefits for using BDA solutions
Current use of analytic techniques
Current level of Big Data skills gap
https://www.databench.eu/wp-content/uploads/2019/03/databench_infographic1.pdf
https://www.databench.eu/big-data-analytics-big-opportunities-for-eu-companies-have-a-look-at-the-first-databench-infographic/
Abstract:
DataBench in the Atos Research and Innovation Newsletter (Projects section), about the General Meeting held by Atos in Madrid
Abstract:
DataBench on the Atos Research and Innovation Newsletter (Events section), to communicate the participation of DataBench at the event #OpenExpoEurope in Madrid.
Abstract:
https://twitter.com/AtosES/status/1106126834205421568 | March 14 (Twitter)
https://twitter.com/AtosES/status/1106474123025399808 | March 15 (Twitter)
https://twitter.com/AtosES/status/1103660852987740160 | March 7 (Twitter)
https://twitter.com/AtosES/status/1103932371055722496 | March 8 (Twitter)
https://www.linkedin.com/feed/update/urn:li:activity:6511537689295228928 | March 14 (LinkedIn)
Abstract:
https://twitter.com/AtosES/status/1106109974307049472 |
https://twitter.com/AtosES/status/1103225970230726657
Abstract:
DataBench press release about the infographic developed with the main highlights of the survey was published at the BDVA PPP Newsletter (March) and BDVA Website.
http://www.big-data-value.eu/databench-infographic-based-on-a-survey-on-700-european-companies/
Abstract:
BigBench standardized as TPCx-BB is a popular application benchmark that targets Big Data storage and processing systems. BigBench V2 addresses some of the BigBench limitations by introducing a new simplified data model, semi-structured web logs in JSON file format and new queries mandating late binding. However, it still covers only batch processing workloads and the Big Data velocity characteristic is not addressed. This work extends the BigBench V2 benchmark with a data streaming component that simulates typical statistical and predictive analytics queries in a retail business scenario. Our approach is to preserve the existing BigBench design and introduce a new streaming component that supports two data streaming modes: active and passive. In active mode, the data stream generation and processing happen in parallel, whereas in passive mode, the data stream is pre-generated in advance before the actual stream processing. The stream workload consists of five queries inspired by the existing 30 BigBench queries. To validate the proposed streaming extension, the two streaming modes were implemented and tested using Kafka and Spark Streaming. The experimental results prove the feasibility of our benchmark design. Finally, we outline design challenges and future plans for improving the proposed BigBench extension.
Abstract:
DataBench press release about the call for industry case studies was published in the BDVA PPP Newsletter (May) and on the BDVA Website
Abstract:
DataBench (Evidence Based Big Data Benchmarking to Improve Business Performance) is an H2020 project started in January 2018 with the goal of providing indicators and metrics to evaluate benchmarks available for assessing Big Data technologies, including data analysis systems. Existing benchmarks range from micro-benchmarks, focusing on a specific technological aspect, to end-to-end benchmarks that evaluate business intelligence applications as a whole and identify maximum workloads and possible weaknesses. The talk will present the indicators which have been studied in the project, which include Business Features, Big Data Application Features, Platform and Architecture Features, and Technical Benchmark Features. The results of an initial survey of the use of big data technologies in Europe, with approximately 700 respondents, will be illustrated and the requirements for different industry sectors analyzed. A specific focus on evaluating big data technologies for scientific data will be presented and discussed, with the aim of gathering further requirements for benchmarking tools for scientific applications, in particular in the combustion domain. Some early results and further requirements for innovative cooperative information systems for storing, curating, managing, and analyzing experiments and models in the combustion domain will also be illustrated. DataBench web site: https://www.databench.eu/
MORE Information:
Abstract:
Moderation of KDD 2018 Project Showcase
Abstract:
n/a
Abstract:
Cross-lingual Real-Time Global Media Monitoring
Abstract:
At times our society feels like a runaway train. Technologies in artificial intelligence, biotech, nanotech, to name only a few fields, are developing at an extreme pace, but are not accompanied by a strategic analysis of their impact not only on our daily lives but the whole of humanity – on social relations, on our emotional and biological selves, as well as on our legal systems and regulatory frameworks. The infrastructure for such fundamental changes is not in place.
The panel will explore cutting edge technological advances and how they will affect our lives and the human race as a whole. What changes can we foresee in the coming decades? How can we ensure that they will lead us to a better existence and that we use technology to improve our lives and not to perpetuate the cycles of global violence and wars that mark human history?
Abstract:
Can data science help reduce police violence and misconduct? Can it help increase retention of patients in care? Can it help prevent children from getting lead poisoning? Can it help cities better target limited resources to improve the lives of citizens? We're all aware of the hype around data science and related buzzwords right now, but turning this hype into social impact takes cross-disciplinary training, teams, and methods.
Abstract:
Benchmarking for Digital Platforms with Big Data, IoT, AI, Cloud, HPC and CyberSecurity is being introduced based on European activities in this area, in particular related to the DataBench project and work in BDVA, AIOTI and ECSO.
Abstract:
Benchmarking session at EBDVF'2018
Introduction - Arne Berre/Axel Ngonga
Designing Big Data Benchmarks - Irini Fundulaki
LDBC - Peter Boncz
DataBench - Gabriella Cattaneo/Tomas P. Lobo
Holistic Benchmarking - Axel Ngonga/Gayane Sedrakyan
Benchmarking as a service - Pavel Smirnov (AGT)
The EU Big Data Inducement Prize challenge - Kimmo Rossi (EC)
Presentation of solutions of winners of the EU Big Data Inducement Prize
Summary and Discussion
Abstract:
A contribution for the June PPP Newsletter was sent to promote the new DataBench page on LinkedIn.
Abstract:
The contextual complexity around big data is increasing exponentially. Governance and regulation, data volume and data variety, the velocity of transmission and computation, and the potential for adoption of open source technologies are a few of the many challenges that companies are facing in the Big Data environment. In addition, digital transformation (DX) in Europe is starting to play a significant role as a basis to enhance the European market position and for a future gain in competitiveness.
MORE Information:
Abstract:
Distributed big data processing and analytics applications demand a comprehensive end-to-end architecture stack consisting of big data technologies. However, there are many possible architecture patterns (e.g. Lambda, Kappa or Pipeline architectures) to choose from when implementing the application requirements. A big data technology in isolation may be best performing for a particular application, but its performance in connection with other technologies depends on the connectors and the environment. Similarly, existing big data benchmarks evaluate the performance of different technologies in isolation, but no work has been done on benchmarking big data architecture stacks as a whole. For example, BigBench (TPCx-BB) may be used to evaluate the performance of Spark, but is it applicable to PySpark or to Spark with Kafka stack as well? What is the impact of having different programming environments and/or any other technology like Spark? This vision paper proposes a new category of benchmark, called ABench, to fill this gap and discusses key aspects necessary for the performance evaluation of different big data architecture stacks.
MORE Information:
Abstract:
In the Big Data era, stream processing has become a common requirement for many data-intensive applications. This has led to many advances in the development and adaption of large scale streaming systems. Spark and Flink have become popular choices for many developers as they combine both batch and streaming capabilities in a single system. However, introducing the Spark Structured Streaming in version 2.0 opened up completely new features for SparkSQL, which are alternatively only available in Apache Calcite. This work focuses on the new Spark Structured Streaming and analyses it by diving into its internal functionalities. With the help of a micro-benchmark consisting of streaming queries, we perform initial experiments evaluating the technology. Our results show that Spark Structured Streaming is able to run multiple queries successfully in parallel on data with changing velocity and volume sizes.
MORE Information:
Abstract:
BigBench standardized as TPCx-BB is a popular application benchmark that targets Big Data storage and processing systems. BigBench V2 addresses some of the BigBench limitations by introducing a new simplified data model, semi-structured web logs in JSON file format and new queries mandating late binding. However, it still covers only batch processing workloads and the Big Data velocity characteristic is not addressed. This work extends the BigBench V2 benchmark with a data streaming component that simulates typical statistical and predictive analytics queries in a retail business scenario. Our approach is to preserve the existing BigBench design and introduce a new streaming component that supports two data streaming modes: active and passive. In active mode, the data stream generation and processing happen in parallel, whereas in passive mode, the data stream is pre-generated in advance before the actual stream processing. The stream workload consists of five queries inspired by the existing 30 BigBench queries. To validate the proposed streaming extension, the two streaming modes were implemented and tested using Kafka and Spark Streaming. The experimental results prove the feasibility of our benchmark design. Finally, we outline design challenges and future plans for improving the proposed BigBench extension.
MORE Information:
Abstract:
This chapter reviews the evolution of analytics benchmarks and their current state (as of 2017). It starts with an overview of the most relevant benchmarking organizations and their benchmark standards, and outlines the latest benchmark developments and initiatives targeting emerging Big Data Analytics systems. Last but not least, the typical benchmark components are described, as well as the different goals that these benchmarks try to achieve.
MORE Information:
Abstract:
In the era of Big Data and AI, it is challenging to know all technical and business advantages of the emerging technologies. The goal of DataBench is to design a benchmarking process helping organizations developing Big Data Technologies (BDT) to reach for excellence and constantly improve their performance, by measuring their technology development activity against parameters of high business relevance. This paper focuses on the internals of the DataBench framework and presents our methodological workflow and framework architecture.
MORE Information:
Abstract:
DataBench promotion @IDC booth
Abstract:
The proliferation of big data technology and faster computing systems has made AI-based solutions pervasive in our lives. There is a need to understand how to benchmark the systems used to build AI-based solutions, which have a complex pipeline of pre-processing, statistical analysis, machine learning and deep learning on data to build prediction models. Solution architects, engineers and researchers may use open-source technology or proprietary systems based on desired performance requirements. The performance metrics may be data pre-processing time, model training time and model inference time. We do not see a single benchmark answering all questions of solution architects and researchers. This tutorial covers both practical and research questions on relevant Big Data and Analytics benchmarks.
MORE Information:
Abstract:
The goal of the DataBench project is to design a benchmarking process helping European organizations developing Big Data Technologies to reach for excellence and improve their performance, by measuring their technology development activity against parameters of high business relevance. DataBench is investigating existing Big Data benchmarking tools and projects, identifying the main gaps and developing a robust set of metrics to compare technical results coming from those tools, which will be available in the main outcome of the project, the DataBench Toolbox.
In this 45-minute session, titled "Big Data – Benchmark your way to excellent business performance", our speakers and representatives of the DataBench Project, Gabriella Cattaneo and Erica Spinoni from IDC, presented the results of the research on Big Data business impacts by European industries. Listen to this webinar to understand how different industries implement Big Data Technologies and benchmark activities for measuring business-related KPIs such as revenue growth, profitability and cost savings, and customer satisfaction, among others, based on a survey conducted by IDC of 700 EU-industry-representative companies.
Abstract:
66 Likes 8 Media Clicks 26 Shares 25 Detail Expands 2 Comments 100% Organic Engagement
Abstract:
Promotion of Virtual BenchLearning (BDVA PPP Mailing List)
Abstract:
Publication of article on News Section BIG DATA – BENCHMARK YOUR WAY TO EXCELLENT BUSINESS PERFORMANCE
Abstract:
Promotion of Virtual BenchLearning (IDC Mailing List)
Abstract:
Promotion of DataBench Virtual BenchLearning - Atos Internal Newsletter Iberia (1100 contacts), Atos Research and Innovation Internal Newsletter (180 contacts)
Abstract:
In this 1-hour session titled "Success stories on Big Data & Analytics Use Cases + DataBench Toolbox", our speakers and participants of the DataBench Project, Chiara Francalanci from Politecnico di Milano and Tomás Pariente from Atos Research and Innovation, presented the most relevant use cases and success stories in the Retail and Manufacturing sectors implementing Big Data, Analytics and Benchmarking tools. The session included a demonstration of the main outcome of the project, the DataBench Toolbox, which allows different types of users, in both technical and business roles, to search, select and deploy Big Data benchmarking tools to generate unified technical metrics and derive business KPIs.
Abstract:
To update metrics
Abstract:
Publication of article on News Section DID YOU MISS THE 1ST VIRTUAL BENCHLEARNING BY DATABENCH? CHECK THE RECORDING AND THE PRESENTATIONS NOW!
Abstract:
Publication of article on News Section
SUCCESS STORIES ON BIG DATA & ANALYTICS USE CASES AND DATABENCH TOOLBOX
Abstract:
Publication of article on News Section
DID YOU MISS THE 2ND VIRTUAL BENCHLEARNING BY DATABENCH? CHECK THE RECORDING AND THE PRESENTATIONS NOW!
Abstract:
Promotion of Virtual BenchLearning (BDVA PPP Mailing List)
Abstract:
Promotion of 2nd DataBench Virtual BenchLearning - Atos Internal Newsletter Iberia (1100 contacts), Atos Research and Innovation Internal Newsletter (180 contacts)
Abstract:
During this 1-hour session organised in collaboration with BDVA, different speakers from the DataBench project provided the audience with a framework and tools to assess the performance and impact of Big Data and AI technologies from the technical perspective, by providing real insights coming from DataBench and other projects active in the benchmarking domain in various industrial sectors.
In addition, representatives from other BDV PPP projects, such as DeepHealth and I-BiDaaS, participated to share the challenges and opportunities they have identified in the use of Big Data, Analytics and AI.
The webinar featured:
An introductory session describing the need for DataBench in Big Data and AI centric analytics and its use in comparing the performance indicators that are important to organisations using them.
A demonstration of the DataBench Toolbox and the advantages it brings for organisations that need to assess their data analytics processes.
A discussion with other BDV PPP projects about their needs in terms of benchmarking big data.
Abstract:
Publication of article on News Section
VIRTUAL BENCHLEARNING: ASSESSING THE PERFORMANCE AND IMPACT OF BIG DATA, ANALYTICS AND AI
Abstract:
Promotion of 3rd Virtual BenchLearning (IDC Mailing List)
Abstract:
Promotion of 3rd Virtual BenchLearning (BDVA PPP Mailing List)
Abstract:
Presentation of research article: "CoreBigBench: Benchmarking big data core operations" Authors: Todor Ivanov, Ahmad Ghazal, Alain Crolotte, Pekka Kostamaa and Yoseph Ghazal
Abstract:
Significant effort was put into big data benchmarking with a focus on end-to-end applications. While covering basic functionalities implicitly, the details of the individual contributions to the overall performance are hidden. As a result, end-to-end benchmarks could be biased toward certain basic functions. Micro-benchmarks are more explicit at covering basic functionalities, but they are usually targeted at some highly specialized functions. In this paper we present CoreBigBench, a benchmark that focuses on the most common functionalities of big data engines and platforms, such as scans, two-way joins, common UDF execution and more. These common functionalities are benchmarked over relational and key-value data models, which cover the majority of data models. The benchmark consists of 22 queries applied to sales data and key-value web logs covering the basic functionalities. We ran CoreBigBench on Hive as a proof of concept, verified that the benchmark is easy to deploy and collected performance data. Finally, we believe that CoreBigBench is a good fit for performance testing of commercial big data engines focused on basic engine functionalities not covered in end-to-end benchmarks.
MORE Information:
Abstract:
Significant effort was put into big data benchmarking with a focus on end-to-end applications. While covering basic functionalities implicitly, the details of the individual contributions to the overall performance are hidden. As a result, end-to-end benchmarks could be biased toward certain basic functions. Micro-benchmarks are more explicit at covering basic functionalities, but they are usually targeted at some highly specialized functions. In this paper we present CoreBigBench, a benchmark that focuses on the most common functionalities of big data engines and platforms, such as scans, two-way joins, common UDF execution and more. These common functionalities are benchmarked over relational and key-value data models, which cover the majority of data models. The benchmark consists of 22 queries applied to sales data and key-value web logs covering the basic functionalities. We ran CoreBigBench on Hive as a proof of concept, verified that the benchmark is easy to deploy and collected performance data. Finally, we believe that CoreBigBench is a good fit for performance testing of commercial big data engines focused on basic engine functionalities not covered in end-to-end benchmarks.
MORE Information:
Abstract:
This is the second edition of this paper about Big Data challenges in the Smart Manufacturing Industry. In the 2018 edition, the focus was on the alignment between the BDVA Reference Model and the EFFRA Strategic Research and Innovation Agenda through analysis of their respective reference architectures and alignment of their respective technical challenges. The main outcome of the 2018 activities was the identification of 56 Big Data technical challenges in the three Manufacturing Grand Scenarios of Smart Factory, Smart Product and Smart Supply Chain. Now in the 2020 edition, a twofold approach was chosen: on the one side (continuity evolutionary approach), we are updating the previous content, not just of the technology but also of the legal-sociobusiness landscape. On the other side, H2020 is in its last period and we need to think of more disruptive and revolutionary topics to be implemented along the future Multiannual Financial Framework (MFF 2021-2027) and respectively in both Digital Europe and Horizon Europe programs. This second approach will continue in the work of the SMI group on elaborating the bridge between H2020 and Horizon / Digital Europe.
MORE Information:
Abstract:
In the era of Big Data and AI, it is challenging to know all technical and business advantages of the emerging technologies. The goal of DataBench is to design a benchmarking process helping organizations developing Big Data Technologies (BDT) to reach for excellence and constantly improve their performance, by measuring their technology development activity against parameters of high business relevance. This paper focuses on the internals of the DataBench framework and presents our methodological workflow and framework architecture.
MORE Information:
Abstract:
The use of big data in organizations involves numerous decisions on the business and technical side. While the assessment of technical choices has been studied by introducing technical benchmarking approaches, the study of the value of big data and of the impact of business key performance indicators (KPIs) on technical choices is still an open problem. The paper discusses a general framework for analyzing big data projects with respect to both technical and business performance indicators, and presents the initial results emerging from a first empirical analysis conducted among European companies and research centers within the European DataBench project and the activities of the benchmarking working group of the Big Data Value Association (BDVA). An analysis method is presented, discussing the impact of confidence and support measurements, and two directions of analysis are studied: the impact of business KPIs on technical parameters and the study of the most important indicators on both the business and the technical side, for specific industry sectors, with the goal of identifying the most relevant design and assessment criteria.
MORE Information:
Abstract:
Organisations rely on evidence from the Benchmarking domain to provide answers on how their processes are performing. There is extensive information on how and why to perform technical benchmarks for the specific management and analytics processes, but there is a lack of objective, evidence-based methods to measure the correlation between Big Data Technology (BDT) benchmarks and an organisation's business benchmarks and demonstrate return on investment (ROI). The DataBench project addresses this significant gap in the current benchmarking community's activities, by providing certifiable benchmarks and evaluation schemes of BDT performance of high business impact and industrial significance.
MORE Information:
Abstract:
NA
Abstract:
NA
Abstract:
NA
Abstract:
This white paper reports on the current view of the DataBench Toolbox architecture and main functional elements as described in the DataBench deliverable D3.1. The goal of the DataBench Toolbox is to provide a way of reusing existing big data benchmarking efforts under a common framework, providing therefore a way to select, download and homogenize technical and business indicators.
Abstract:
Relating big data business and technical performance indicators
Barbara Pernici, Chiara Francalanci, Angela Geronazzo, Lucia Polidori, Stefano Ray, Leonardo Riva, Politecnico di Milano, Arne Jørgen Berre, SINTEF, Todor Ivanov, Goethe University
itAIS 2018, Pavia
12 October 2018
MORE Information:
Abstract:
https://bit.ly/2DvwFaD / https://bit.ly/2GEKtCW / https://bit.ly/2RTzl77
Abstract:
https://bit.ly/2N2yjVr / https://bit.ly/2TJXU83 / https://bit.ly/2UQB52I / https://bit.ly/2UWO8zP / https://bit.ly/2N2IrgP / https://bit.ly/2V0Dubp
Abstract:
https://bit.ly/2SL5qCq / https://bit.ly/2SyFHhu / https://bit.ly/2Bzlldh / https://bit.ly/2N3rFOw
Abstract:
DataBench press release about the participation at KDD 2018 was published in the BDVA PPP Newsletter (September) and on the BDVA Website
http://www.big-data-value.eu/the-databench-project-was-presented-at-the-kdd-conference-2018-in-london/
http://www.bdva.eu/sites/default/files/PDF_Newsletter_SEPT_2018.pdf
Abstract:
DataBench press release about the paper "Big Data Key Performance Indicators" for the itAIS Conference was published in the BDVA PPP Newsletter (September) and on the BDVA Website
http://www.big-data-value.eu/databench-paper-big-data-key-performance-indicators-accepted-at-itais-conference/
http://www.bdva.eu/sites/default/files/PDF_Newsletter_SEPT_2018.pdf
Abstract:
DataBench press release about the performance of the project during 2018 was published in the BDVA PPP Newsletter (December) and on the BDVA Website
http://www.big-data-value.eu/databench-project-finishes-its-first-year-of-activity-with-successful-results/
http://www.bdva.eu/sites/default/files/PDF_Newsletter_DEC_2018.pdf
Abstract:
DataBench press release about the creation of a Benchmarking Community was published in the BDVA PPP Newsletter (January) and on the BDVA Website
http://www.big-data-value.eu/databench-call-for-action-are-you-working-on-projects-related-to-big-data-that-uses-or-develop-benchmarks/?et_fb=1
http://www.bdva.eu/sites/default/files/PDF_Newsletter_January_2019.pdf
Abstract:
First webinar scheduled for the BDVe Webinar Series: DataBench Project
Abstract:
Results of the webinar held by DataBench with Arne Berre and Gabriella Cattaneo
Abstract:
DataBench on the Atos Research and Innovation Newsletter (Events section), for the promotion of the BDVe Series webinar with Arne Berre and Gabriella Cattaneo, and the acceptance of the paper for the itAIS Conference
Abstract:
DataBench on the Atos Research and Innovation Newsletter (Events section), for the promotion of the project's participation at the EBDVF2018
Abstract:
DataBench on the Atos Research and Innovation Newsletter (Events section), for the promotion of the project's participation at the ICT2018
Abstract:
DataBench on the Atos Research and Innovation Newsletter (Projects section), for the promotion of the project's new Scientific Publications Section on the website
Abstract:
Flyer to download the content of the WhitePaper: DataBench Toolbox
Abstract:
Benchmarking for Big Data Applications with the DataBench Framework - presented at the Second IEEE International Workshop on Benchmarking, Performance Tuning and Optimization for Big Data Applications (BPOD), held at the 2018 IEEE International Conference on Big Data
Abstract:
Presence at the booth and distribution of marketing materials, audience engagement, DataBench video display
Abstract:
This IDC Survey Spotlight provides details on the metrics used by European companies to evaluate their Big Data and analytics (BDA) environments and benchmark the impact of BDA solutions already in place. Respondents include European companies with more than 10 employees covering the most important industry sectors.
The analysis is based on data from IDC's EMEA EU DataBench Survey, October 2018.
Abstract:
This IDC Survey Spotlight provides details on the importance for European companies to evaluate their Big Data and Analytics (BDA) environments with the right set of technical performance metrics. Respondents are European companies with more than 10 employees, covering the most important industry sectors.
The analysis is based on data from IDC's EMEA EU DataBench Survey in October 2018.
Abstract:
This IDC Survey presents key findings from IDC's European DataBench Survey on the benefits that European companies are achieving or expect to achieve through the deployment of Big Data and analytics (BDA) solutions. Conducted in October 2018, the survey covered 700 respondents in key industry sectors in 11 European countries, grouped by region (Nordics, South Europe, and Central and Eastern Europe). The survey, as part of a European Commission-funded project, was conducted to understand the current European BDA environment and provide organizations with actionable insights and tools to benchmark themselves on the European market. For more information about the project, visit www.databench.eu.
"European companies are achieving important benefits from BDA deployment, particularly in deepening their understanding of customer behavior. Companies must adopt prescriptive and predictive analytics techniques if they want to remain competitive and improve market performance," said Erica Spinoni, research analyst, IDC Europe.
Abstract:
This IDC Survey Spotlight examines how much European organizations have integrated their Big Data analytics (BDA) environments with key Innovation Accelerator technologies. Survey respondents are European companies with more than 10 employees and cover the most important industry sectors.
The analysis is based on data from IDC's European DataBench Survey, October 2018.
MORE Information:
Abstract:
This IDC Survey presents the key findings from IDC's European DataBench Survey on the approach to data management and the benefits derived from the deployment of Big Data analytics (BDA) solutions. The survey, conducted in October 2018, included 700 respondents in Europe and covered the most important industry sectors. The companies sampled are based in the most important Western and Eastern European countries.
The survey, which is part of a project funded by the European Commission, was conducted to understand the current European BDA environment and provide European companies with fruitful insights and tools to benchmark themselves on the European market. Visit https://www.databench.eu/ for more information about the project.
"Real-time solutions for Big Data analytics are powerful tools that can enable businesses to improve their management styles, deepen their understanding of daily routines, and achieve better performance and process improvements," said Erica Spinoni, research analyst, IDC Europe.
MORE Information:
Abstract:
DataBench press release about the results of the survey was published in the BDVA PPP Newsletter (February) and on the BDVA Website
http://www.big-data-value.eu/databench-project-released-the-results-of-its-survey-on-700-european-companies-focused-on-their-actual-or-planned-use-of-big-data-and-analytics/
http://www.bdva.eu/sites/default/files/PDF_Newsletter_February_2019_compressed.pdf
Abstract:
The DataBench project has developed the first of a series of infographics presenting the results of a survey about Big Data and Analytics (BDA) covering 700 European companies.
The survey, carried out by DataBench between September and October 2018, investigated the actual or planned use of BDA by 700 European businesses in 11 EU Member States and identified the relevance of business KPIs for BDA users. The detailed analysis is presented in report D2.2 of the DataBench Project.
The infographic presents the main highlights of the survey concerning:
Level of adoption of BDA solutions by EU businesses from different Industries and Business Areas
Business goals driving the adoption of this type of technologies
Most relevant KPI categories for measuring the business impact
Achievements and expected benefits for using BDA solutions
Current use of analytic techniques
Current level of Big Data skills gap
https://www.databench.eu/wp-content/uploads/2019/03/databench_infographic1.pdf
https://www.databench.eu/big-data-analytics-big-opportunities-for-eu-companies-have-a-look-at-the-first-databench-infographic/
Abstract:
DataBench in the Atos Research and Innovation Newsletter (Projects section), about the General Meeting held by Atos in Madrid
Abstract:
DataBench on the Atos Research and Innovation Newsletter (Events section), to communicate the participation of DataBench at the event #OpenExpoEurope in Madrid.
Abstract:
https://twitter.com/AtosES/status/1106126834205421568 | March 14 (Twitter)
https://twitter.com/AtosES/status/1106474123025399808 | March 15 (Twitter)
https://twitter.com/AtosES/status/1103660852987740160 | March 7 (Twitter)
https://twitter.com/AtosES/status/1103932371055722496 | March 8 (Twitter)
https://www.linkedin.com/feed/update/urn:li:activity:6511537689295228928 | March 14 (LinkedIn)
Abstract:
https://twitter.com/AtosES/status/1106109974307049472
https://twitter.com/AtosES/status/1103225970230726657
Abstract:
DataBench press release about the infographic developed with the main highlights of the survey was published at the BDVA PPP Newsletter (March) and BDVA Website.
http://www.big-data-value.eu/databench-infographic-based-on-a-survey-on-700-european-companies/
Abstract:
BigBench, standardized as TPCx-BB, is a popular application benchmark that targets Big Data storage and processing systems. BigBench V2 addresses some of the BigBench limitations by introducing a new simplified data model, semi-structured web logs in JSON file format and new queries mandating late binding. However, it still covers only batch processing workloads, and the Big Data velocity characteristic is not addressed. This work extends the BigBench V2 benchmark with a data streaming component that simulates typical statistical and predictive analytics queries in a retail business scenario. Our approach is to preserve the existing BigBench design and introduce a new streaming component that supports two data streaming modes: active and passive. In active mode, the data stream generation and processing happen in parallel, whereas in passive mode, the data stream is pre-generated before the actual stream processing. The stream workload consists of five queries inspired by the existing 30 BigBench queries. To validate the proposed streaming extension, the two streaming modes were implemented and tested using Kafka and Spark Streaming. The experimental results prove the feasibility of our benchmark design. Finally, we outline design challenges and future plans for improving the proposed BigBench extension.
Abstract:
DataBench press release about the call for industry case studies was published in the BDVA PPP Newsletter (May) and on the BDVA Website
Abstract:
DataBench (Evidence Based Big Data Benchmarking to Improve Business Performance) is an H2020 project started in January 2018 with the goal of providing indicators and metrics to evaluate benchmarks available for assessing Big Data technologies, including data analysis systems. Existing benchmarks range from micro-benchmarks, focusing on a specific technological aspect, to end-to-end benchmarks to evaluate business intelligence applications as a whole and identify maximum workloads and possible weaknesses. The talk will present the indicators which have been studied in the project and include Business Features, Big Data Application Features, Platform and Architecture Features, and Technical Benchmark Features. The results of an initial survey of the use of big data technologies in Europe with approximately 700 respondents will be illustrated and the requirements for different industry sectors analyzed. A specific focus on evaluating big data technologies for scientific data will be presented and discussed, with the aim of gathering further requirements for benchmarking tools for scientific applications, in particular in the combustion domain. Some early results and further requirements for innovative cooperative information systems for storing, curating, managing, and analyzing experiments and models in the combustion domain will also be illustrated. DataBench web site: https://www.databench.eu/
MORE Information:
Abstract:
Moderation of KDD 2018 Project Showcase
Abstract:
n/a
Abstract:
Cross-lingual Real-Time Global Media Monitoring
Abstract:
At times our society feels like a runaway train. Technologies in artificial intelligence, biotech, nanotech, to name only a few fields, are developing at an extreme pace, but are not accompanied by a strategic analysis of their impact not only on our daily lives but the whole of humanity – on social relations, on our emotional and biological selves, as well as on our legal systems and regulatory frameworks. The infrastructure for such fundamental changes is not in place.
The panel will explore cutting edge technological advances and how they will affect our lives and the human race as a whole. What changes can we foresee in the coming decades? How can we ensure that they will lead us to a better existence and that we use technology to improve our lives and not to perpetuate the cycles of global violence and wars that mark human history?
Abstract:
Can data science help reduce police violence and misconduct? Can it help increase retention of patients in care? Can it help prevent children from getting lead poisoning? Can it help cities better target limited resources to improve the lives of citizens? We're all aware of the hype around data science and related buzzwords right now, but turning this hype into social impact takes cross-disciplinary training, teams, and methods.
Abstract:
Benchmarking for Digital Platforms with Big Data, IoT, AI, Cloud, HPC and CyberSecurity is being introduced based on European activities in this area, in particular related to the DataBench project and work in BDVA, AIOTI and ECSO.
Abstract:
Benchmarking session at EBDVF'2018
Introduction - Arne Berre/Axel Ngonga
Designing Big Data Benchmarks - Irini Fundulaki
LDBC - Peter Boncz
DataBench - Gabriella Cattaneo/Tomas P. Lobo
Holistic Benchmarking - Axel Ngonga/Gayane Sedrakyan
Benchmarking as a service - Pavel Smirnov (AGT)
The EU Big Data Inducement Prize challenge - Kimmo Rossi (EC)
Presentation of solutions of winners of the EU Big Data Inducement Prize
Summary and Discussion
Abstract:
A contribution for the June PPP Newsletter was sent to promote the new DataBench page on LinkedIn.
Abstract:
The contextual complexity around big data is increasing exponentially. Governance and regulation, data volume and data variety, the velocity of transmission and computation, and the potential for adoption of open source technologies are a few of the many challenges that companies are facing in the Big Data environment. In addition, digital transformation (DX) in Europe is starting to play a significant role as a basis to enhance the European market position and for a future gain in competitiveness.
MORE Information:
Abstract:
Distributed big data processing and analytics applications demand a comprehensive end-to-end architecture stack consisting of big data technologies. However, there are many possible architecture patterns (e.g. Lambda, Kappa or Pipeline architectures) to choose from when implementing the application requirements. A big data technology in isolation may be best performing for a particular application, but its performance in connection with other technologies depends on the connectors and the environment. Similarly, existing big data benchmarks evaluate the performance of different technologies in isolation, but no work has been done on benchmarking big data architecture stacks as a whole. For example, BigBench (TPCx-BB) may be used to evaluate the performance of Spark, but is it applicable to PySpark or to Spark with Kafka stack as well? What is the impact of having different programming environments and/or any other technology like Spark? This vision paper proposes a new category of benchmark, called ABench, to fill this gap and discusses key aspects necessary for the performance evaluation of different big data architecture stacks.
MORE Information:
Abstract:
In the Big Data era, stream processing has become a common requirement for many data-intensive applications. This has led to many advances in the development and adaption of large-scale streaming systems. Spark and Flink have become a popular choice for many developers as they combine both batch and streaming capabilities in a single system. However, the introduction of Spark Structured Streaming in version 2.0 opened up completely new features for SparkSQL, which are alternatively only available in Apache Calcite. This work focuses on the new Spark Structured Streaming and analyses it by diving into its internal functionalities. With the help of a micro-benchmark consisting of streaming queries, we perform initial experiments evaluating the technology. Our results show that Spark Structured Streaming is able to run multiple queries successfully in parallel on data with changing velocity and volume sizes.
MORE Information:
Abstract:
BigBench, standardized as TPCx-BB, is a popular application benchmark that targets Big Data storage and processing systems. BigBench V2 addresses some of the BigBench limitations by introducing a new simplified data model, semi-structured web logs in JSON file format and new queries mandating late binding. However, it still covers only batch processing workloads, and the Big Data velocity characteristic is not addressed. This work extends the BigBench V2 benchmark with a data streaming component that simulates typical statistical and predictive analytics queries in a retail business scenario. Our approach is to preserve the existing BigBench design and introduce a new streaming component that supports two data streaming modes: active and passive. In active mode, the data stream generation and processing happen in parallel, whereas in passive mode, the data stream is pre-generated before the actual stream processing. The stream workload consists of five queries inspired by the existing 30 BigBench queries. To validate the proposed streaming extension, the two streaming modes were implemented and tested using Kafka and Spark Streaming. The experimental results prove the feasibility of our benchmark design. Finally, we outline design challenges and future plans for improving the proposed BigBench extension.
MORE Information:
Abstract:
This chapter reviews the evolution of analytics benchmarks and their current state (as of 2017). It starts with an overview of the most relevant benchmarking organizations and their benchmark standards, and outlines the latest benchmark developments and initiatives targeting the emerging Big Data Analytics systems. Last but not least, the typical benchmark components are described, as well as the different goals that these benchmarks try to achieve.
MORE Information:
Abstract:
In the era of Big Data and AI, it is challenging to know all technical and business advantages of the emerging technologies. The goal of DataBench is to design a benchmarking process helping organizations developing Big Data Technologies (BDT) to reach for excellence and constantly improve their performance, by measuring their technology development activity against parameters of high business relevance. This paper focuses on the internals of the DataBench framework and presents our methodological workflow and framework architecture.
MORE Information:
Abstract:
DataBench promotion @IDC booth
Abstract:
The proliferation of big data technology and faster computing systems has led to the pervasion of AI-based solutions in our lives. There is a need to understand how to benchmark the systems used to build AI-based solutions, which have a complex pipeline of pre-processing, statistical analysis, machine learning and deep learning on data to build prediction models. Solution architects, engineers and researchers may use open-source technology or proprietary systems based on desired performance requirements. The performance metrics may be data pre-processing time, model training time and model inference time. We do not see a single benchmark answering all questions of solution architects and researchers. This tutorial covers both practical and research questions on relevant Big Data and Analytics benchmarks.
MORE Information:
Abstract:
The goal of the DataBench project is to design a benchmarking process helping European organizations developing Big Data Technologies to reach for excellence and improve their performance, by measuring their technology development activity against parameters of high business relevance. DataBench is investigating existing Big Data benchmarking tools and projects, identifying the main gaps and developing a robust set of metrics to compare technical results coming from those tools that will be available in the main outcome of the project, the DataBench Toolbox.
In this 45-minute session titled "Big Data – Benchmark your way to excellent business performance", our speakers and representatives of the DataBench Project, Gabriella Cattaneo and Erica Spinoni from IDC, presented the results of the research on Big Data business impacts across European industries. Listen to this webinar to understand how different industries implement Big Data Technologies and benchmark activities for measuring business-related KPIs such as revenue growth, profitability and cost savings, and customer satisfaction, among others, based on a survey conducted by IDC of 700 EU-industry-representative companies.
Abstract:
66 Likes, 8 Media Clicks, 26 Shares, 25 Detail Expands, 2 Comments, 100% Organic Engagement
Abstract:
Promotion of Virtual BenchLearning (BDVA PPP Mailing List)
Abstract:
Publication of article on News Section: BIG DATA – BENCHMARK YOUR WAY TO EXCELLENT BUSINESS PERFORMANCE
Abstract:
Promotion of Virtual BenchLearning (IDC Mailing List)
Abstract:
Promotion of DataBench Virtual BenchLearning - Atos Internal Newsletter Iberia (1100 contacts); Atos Research and Innovation Internal Newsletter (180 contacts)
Abstract:
In this 1-hour session titled "Success stories on Big Data & Analytics Use Cases + DataBench Toolbox", our speakers and participants in the DataBench Project, Chiara Francalanci from Politecnico di Milano and Tomás Pariente from Atos Research and Innovation, presented the most relevant use cases and success stories from the Retail and Manufacturing sectors implementing Big Data, Analytics and Benchmarking tools. The session included a demonstration of the main outcome of the project, the DataBench Toolbox, which allows different types of users, in both technical and business roles, to search, select and deploy Big Data benchmarking tools to generate unified technical metrics and derive business KPIs.
Abstract:
To update metrics
Abstract:
Publication of article on News Section: DID YOU MISS THE 1ST VIRTUAL BENCHLEARNING BY DATABENCH? CHECK THE RECORDING AND THE PRESENTATIONS NOW!
Abstract:
Publication of article on News Section
SUCCESS STORIES ON BIG DATA & ANALYTICS USE CASES AND DATABENCH TOOLBOX
Abstract:
Publication of article on News Section
DID YOU MISS THE 2ND VIRTUAL BENCHLEARNING BY DATABENCH? CHECK THE RECORDING AND THE PRESENTATIONS NOW!
Abstract:
Promotion of Virtual BenchLearning (BDVA PPP Mailing List)
Abstract:
Promotion of 2nd DataBench Virtual BenchLearning - Atos Internal Newsletter Iberia (1100 contacts); Atos Research and Innovation Internal Newsletter (180 contacts)
Abstract:
During this 1-hour session, organised in collaboration with BDVA, speakers from the DataBench project provided the audience with a framework and tools to assess the performance and impact of Big Data and AI technologies from a technical perspective, drawing on real insights from DataBench and other projects active in the benchmarking domain in various industrial sectors.
In addition, representatives from other BDV PPP projects, such as DeepHealth and I-BiDaaS, took part to share the challenges and opportunities they have identified in the use of Big Data, Analytics and AI.
The webinar featured:
An introductory session describing the need for DataBench in Big Data and AI-centric analytics and its use in comparing the performance indicators that are important to the organisations using them.
A demonstration of the DataBench Toolbox and the advantages it brings for organisations that need to assess their data analytics processes.
A discussion with other BDV PPP projects about their needs in terms of benchmarking big data.
Abstract:
Publication of article on News Section
VIRTUAL BENCHLEARNING: ASSESSING THE PERFORMANCE AND IMPACT OF BIG DATA, ANALYTICS AND AI
Abstract:
Promotion of 3rd Virtual BenchLearning (IDC Mailing List)
Abstract:
Promotion of 3rd Virtual BenchLearning (BDVA PPP Mailing List)
Abstract:
Presentation of research article: "CoreBigBench: Benchmarking big data core operations". Authors: Todor Ivanov, Ahmad Ghazal, Alain Crolotte, Pekka Kostamaa and Yoseph Ghazal
Abstract:
Significant effort was put into big data benchmarking with a focus on end-to-end applications. While covering basic functionalities implicitly, the details of the individual contributions to the overall performance are hidden. As a result, end-to-end benchmarks could be biased toward certain basic functions. Micro-benchmarks are more explicit at covering basic functionalities, but they are usually targeted at some highly specialized functions. In this paper we present CoreBigBench, a benchmark that focuses on the most common functionalities of big data engines/platforms, like scans, two-way joins, common UDF execution and more. These common functionalities are benchmarked over relational and key-value data models, which cover the majority of data models. The benchmark consists of 22 queries applied to sales data and key-value web logs covering the basic functionalities. We ran CoreBigBench on Hive as a proof of concept, verified that the benchmark is easy to deploy, and collected performance data. Finally, we believe that CoreBigBench is a good fit for performance testing of commercial big data engines focused on basic engine functionalities not covered in end-to-end benchmarks.
MORE Information:
Abstract:
Significant effort was put into big data benchmarking with a focus on end-to-end applications. While covering basic functionalities implicitly, the details of the individual contributions to the overall performance are hidden. As a result, end-to-end benchmarks could be biased toward certain basic functions. Micro-benchmarks are more explicit at covering basic functionalities, but they are usually targeted at some highly specialized functions. In this paper we present CoreBigBench, a benchmark that focuses on the most common functionalities of big data engines/platforms, like scans, two-way joins, common UDF execution and more. These common functionalities are benchmarked over relational and key-value data models, which cover the majority of data models. The benchmark consists of 22 queries applied to sales data and key-value web logs covering the basic functionalities. We ran CoreBigBench on Hive as a proof of concept, verified that the benchmark is easy to deploy, and collected performance data. Finally, we believe that CoreBigBench is a good fit for performance testing of commercial big data engines focused on basic engine functionalities not covered in end-to-end benchmarks.
MORE Information:
Abstract:
This is the second edition of this paper about Big Data challenges in the Smart Manufacturing Industry. In the 2018 edition, the focus was on the alignment between the BDVA Reference Model and the EFFRA Strategic Research and Innovation Agenda through analysis of their respective reference architectures and alignment of their respective technical challenges. The main outcome of the 2018 activities was the identification of 56 Big Data technical challenges in the three Manufacturing Grand Scenarios of Smart Factory, Smart Product and Smart Supply Chain. Now, in the 2020 edition, a twofold approach was chosen. On the one side (the continuity, evolutionary approach), we are updating the previous content, not just of the technology but also of the legal-sociobusiness landscape. On the other side, H2020 is in its last period and we need to think of more disruptive and revolutionary topics to be implemented along the future Multiannual Financial Framework (MFF 2021-2027) and, respectively, in both the Digital Europe and Horizon Europe programs. This second approach will continue in the work of the SMI group on elaborating the bridge between H2020 and Horizon / Digital Europe.
MORE Information:
Abstract:
In the era of Big Data and AI, it is challenging to know all technical and business advantages of the emerging technologies. The goal of DataBench is to design a benchmarking process helping organizations developing Big Data Technologies (BDT) to reach for excellence and constantly improve their performance, by measuring their technology development activity against parameters of high business relevance. This paper focuses on the internals of the DataBench framework and presents our methodological workflow and framework architecture.
MORE Information: