15. Data Analytics on top of Quality of Service Cloud Computing Infrastructures for Massively connected societies

Author(s): 
Ignacio Blanquer (Universitat Politècnica de València), Wagner Meira Jr. (Universidade Federal de Minas Gerais)
Focus area: 
This position paper focuses on the creation of research networks by means of cloud services and on cooperation between Europe and Brazil. Synergies between Europe and Brazil in this area will create bridges that could carry the impact of this work from one continent to the other. In this context, we expect European cloud and Big Data solutions to find demand among Brazilian researchers and to give rise to durable cooperation activities in areas such as QoS for clouds. Brazilian researchers, in turn, will have a good opportunity to extend their data processing models to other datasets and populations in Europe.
Who stands to benefit and how: 
BIGSEA will develop a framework for ensuring the QoS of data analytics services on top of cloud computing infrastructures while preserving security and privacy. Application developers will leverage application porting, data analytics, automatic elasticity and Big Data privacy services to develop Big Data applications faster. BIGSEA will also develop a user scenario focusing on massively connected societies. This scenario will build on the BIGSEA services to develop novel strategies and techniques for efficiently integrating multidimensional layers of heterogeneous data, with different standards, data types, time scales, geographical coverage, and data quality, all of great interest to the general public.
Focus of your position paper: 
EU-Bra BIGSEA, “Europe – Brazil Collaboration of Big Data Scientific Research Through Cloud-Centric Applications”, is a project funded under the third coordinated call Europe – Brazil by the European Commission and the Brazilian Ministry of Science, Technology and Innovation (Ministério da Ciência, Tecnologia e Inovação) to create a sustainable international (European and Brazilian) cooperation activity in the area of cloud services for Big Data analytics. EU-Bra BIGSEA aims to address the main issues in the Quality of Service of cloud infrastructures for Big Data analytics. Predictive models for vertical and horizontal elasticity will be developed for both computing and storage, exposing simplified programmatic interfaces to upper-layer applications. Public clouds offer customized high-level services for data analytics, which depend on specific features of the provider. This reduces the ability to migrate virtual appliances from one provider to another. EU-Bra BIGSEA will use standards for the specification of virtual appliances (such as TOSCA) and plug-in-based support for multiple infrastructures (including major public cloud providers and on-premises clouds) to reduce vendor lock-in. High-level data analytics services will provide a console and expose APIs so that they can be used from standard programming models, transparently to the user. Users are expected to be able to deploy their own customized elastic virtual computing infrastructure on top of their data stores.
The use case for the validation of the platform focuses on city traffic analysis, exploiting data from multiple heterogeneous sources. The project plans to develop novel strategies and techniques for efficiently integrating multidimensional layers of heterogeneous data, with different standards, data types, time scales, geographical coverage, and data quality. It will use the platform to propose, instantiate and validate novel descriptive modelling methodologies for GES3 data, considering its temporal, spatial and dynamic nature.
These analyses will be used to understand the relevant relationships and patterns inherent in traffic, environmental and user data. EU-Bra BIGSEA will also build prediction models that support route recommendation, as well as traffic and public transportation planning. The models will leverage the QoS and security capabilities of the EU-Bra BIGSEA platform to expose the service to users.
Finally, Big Data security is an issue that requires substantial cooperation and development. Traditional data security measures are not efficient when applied to the large volumes and high velocities of Big Data, which require different approaches and technologies.