Publications — Gonzalo Zarza

La ingeniería del Big Data, cómo trabajar con datos

Book [Spanish]
J.J. López Murphy, G. Zarza (2017)

Big data is more than a property of a mass of data or a set of technologies. Used effectively, it is the vehicle for implementing a data-driven paradigm, perhaps the greatest challenge and leap in quality that organizations can currently aspire to, and a strategic necessity for staying competitive in the future.

This book walks through the stages needed to carry out these initiatives effectively: an understanding of data and information, the types of technologies, how to start a project from scratch, rookie mistakes, reaching maturity, and perspectives on the future.

GET IT

Multipath Fault-tolerant Routing Policies to deal with Dynamic Link Failures in High Speed Interconnection Networks

PhD Thesis [English]
G. Zarza (2011)

Interconnection networks communicate and link together the processing units of modern high-performance computing systems. In this context, network faults have an extremely high impact since most routing algorithms have not been designed to tolerate faults. Because of this, even a single link failure may stall messages in the network, leading to deadlock configurations or, even worse, preventing the finalization of applications running on computing systems. In this thesis we present fault-tolerant routing policies based on concepts of adaptability and deadlock freedom, capable of serving interconnection networks affected by a large number of link failures. Two contributions are presented throughout this thesis, namely: a multipath fault-tolerant routing method, and a novel and scalable deadlock avoidance technique […]

PDF

Method, system and router for avoiding blockages in an interconnection network

Patent Draft [Spanish, English & French]
G. Zarza, D. Franco Puntes, D. Lugones, E. Luque (2013)

The method comprises: a) detecting a situation prone to a blockage; and b) identifying a routing cycle involved in the detected situation prone to a blockage by virtue of a router (r_i) carrying out the following substeps by means of an asynchronous intra-router and inter-router search mechanism which does not require the use of timers: b1) composing and sending an identification message from an input buffer (a_ij) of the router (r_i) to an output buffer (b_hk) of another router (r_h); and b2) receiving the identification message in the output buffer (b_ik) associated with said input buffer (a_ij) of the router (r_i) which composed said message, following the retransmission thereof by at least another router (r_h) from an input buffer (a_hj) thereof. The system and the router are adapted to implement the method proposed by the invention.

DRAFT
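
As a rough illustration of step b) in the patent abstract above, the sketch below forwards an identification message along a chain of blocked buffers and reports a routing cycle when the message returns to the buffer that composed it. It is a simplified model under stated assumptions, not the patented mechanism: the `waits_on` map, the buffer labels, and the function name `identify_cycle` are hypothetical, and the distinction between input and output buffers and between intra-router and inter-router hops is flattened into a single dependency map.

```python
from typing import Dict, List, Optional


def identify_cycle(waits_on: Dict[str, str], origin: str) -> Optional[List[str]]:
    """Forward an identification message along buffer dependencies.

    waits_on maps each blocked buffer to the buffer it is waiting for
    (hypothetical representation). Returns the buffers visited, in order,
    if the message comes back to origin; otherwise returns None.
    """
    path = [origin]
    seen = {origin}
    current = waits_on.get(origin)
    while current is not None:
        if current == origin:
            return path  # message returned to its composer: cycle identified
        if current in seen:
            return None  # a loop exists, but origin is not part of it
        seen.add(current)
        path.append(current)
        current = waits_on.get(current)
    return None  # the chain ended at a buffer that is not blocked


# Three buffers waiting on each other in a ring (illustrative labels only).
ring = {"r1.a": "r2.a", "r2.a": "r3.a", "r3.a": "r1.a"}
cycle = identify_cycle(ring, "r1.a")
```

No timers appear in the sketch: the search ends either when the message returns to its origin or when the dependency chain runs out, which mirrors the timer-free property the abstract mentions.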

An analysis on how can AI empower the senior population in their access to banking services

Short Paper [English]
Y.A. Peral, E. Concepción, I. López-Samaniego, & G. Zarza (2022)

This article addresses the banking interaction needs of the senior population, devoting special attention to the unique needs of this age demographic delineated by the natural cognitive decline caused by ageing. Although seniors are a compelling business audience, the financial sector, and particularly banking institutions, has significantly underestimated the importance of designing a customer experience that is better adapted to the needs of senior users. A compelling yet achievable solution to this problem is the adoption of […]

PDF

Architectural Design Criteria for Evolvable Data-Intensive Machine Learning Platforms

Full Paper [English]
Gonzalo Zarza, J.J. López Murphy (2020)

Recent advances in Artificial Intelligence (AI) have fostered a widespread adoption of Machine Learning (ML) capabilities within many products and services. However, most organizations are not well suited to fully exploit the strategic advantages of AI. Implementing ML solutions is still a complex endeavor due to the fast-paced evolution and the intrinsic exploratory nature of state-of-the-art ML techniques. In many respects, the evolution of data platforms through highly parallel or high-performance technologies has focused on the capacity to massively process the elements consumed by these ML models […]

PDF

Data Science & Engineering into Food Science: A novel Big Data Platform for Low Molecular Weight Gelators' Behavioral Analysis

Full Paper [English]
V. Cuello, M.G. Corradini, M. Rogers & G. Zarza (2020)

The objective of this article is to introduce a comprehensive end-to-end solution aimed at enabling the application of state-of-the-art Data Science and Analytic methodologies to a food-science-related problem. The problem refers to the automation of load, homogenization, complex processing and real-time accessibility to low molecular-weight gelators (LMWGs) data to gain insights into their assembly behavior, i.e. whether a gel can be mixed with an appropriate solvent or not. Most of the work within the field of Colloidal and Food Science in relation to LMWGs has centered on identifying adequate solvents that can generate stable gels and […]

PDF

History-Aware Adaptive Routing Algorithm For Energy Saving in Interconnection Networks

Full Paper [English]
H. Nguyen, G. Zarza, D. Franco, E. Luque (2013)

The increase of link speeds in interconnection networks is evident both inside and outside the datacenter. As a result, links account for a growing portion of the power budget of the interconnection system. Link power management has been receiving more attention and many mechanisms have been proposed. The emerging bit-serial link technology allows links to work with different numbers of lanes and speeds. When the traffic load is light, links are put in low-speed mode and consume less energy. However, links working in low-speed mode increase serialization latency. We propose a routing algorithm that takes into account the usage history of the links to focus network traffic on a small subset of high-speed links. It keeps high-speed links busier and leaves low-speed links with more idle time. Thus the mechanism saves energy and reduces the incurred serialization latency.

PDF
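
The idea in the abstract above, steering traffic toward links that have been busy recently so that the rest can stay in low-speed mode, can be sketched as follows. This is a minimal illustration under assumptions that are not taken from the paper: usage history is modelled as an exponentially weighted moving average, and the names `Link`, `update_history`, `choose_link`, `alpha` and `max_load` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Link:
    name: str
    history: float = 0.0  # assumed: moving average of recent usage, 0..1
    load: float = 0.0     # assumed: current utilisation, 0..1


def update_history(link: Link, used: bool, alpha: float = 0.25) -> None:
    """Blend the latest observation into the link's usage history."""
    sample = 1.0 if used else 0.0
    link.history = (1 - alpha) * link.history + alpha * sample


def choose_link(candidates: List[Link], max_load: float = 0.9) -> Link:
    """Prefer the most-used candidate that still has spare capacity."""
    usable = [link for link in candidates if link.load < max_load]
    if usable:
        return max(usable, key=lambda link: link.history)
    # Every candidate is saturated: fall back to the least loaded one.
    return min(candidates, key=lambda link: link.load)


busy = Link("east", history=0.8, load=0.5)
idle = Link("north", history=0.1, load=0.1)
picked = choose_link([busy, idle])
```

Preferring the link with the higher history keeps already-active links busy, while the `max_load` threshold lets traffic spill onto a quieter link once the preferred one is close to saturation.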

An innovative teaching strategy to understand high-performance systems through performance evaluation

Article [English]
G. Zarza, D. Lugones, D. Franco, E. Luque (2012)

Nowadays, the study of high-performance computing (HPC) is one of the essential aspects of postgraduate programmes in Computational Science. However, university education in HPC often suffers from a significant gap between theoretical concepts and the practical experience of students. To face this challenge, we have implemented an innovative teaching strategy to provide students with appropriate resources to ease the assimilation of theoretical concepts, while improving their practical experience through the use of teaching tools and resources specifically designed to promote active learning. We have used the proposed strategy to organize the module of Parallel Computers and Architectures of the Master's in High-Performance Computing […]

PDF

ClusterGUI, an Application to Launch OPNET Simulations within Resource Managed Environments

Full Paper [English]
C. Nuñez, G. Zarza, D. Lugones, J. Navarro, D. Franco, E. Luque (2011)

This work presents ClusterGUI, an application specialized in the submission and control of OPNET simulation jobs to the Oracle Grid Engine (previously known as Sun Grid Engine - SGE) Resource Manager, using the Distributed Resource Management Application API (DRMAA) libraries. With a Resource Manager acting as middleware between cluster resources and simulation runs, traditional tools cannot be directly used to launch the simulations. With ClusterGUI we can send multiple simulations to a resource manager in a cluster in order to fully use available resources. We compare simulation results by launching many standalone runs against a resource-managed distributed approach, to perform parametric studies more quickly by launching simulations concurrently.

PDF

Non-blocking adaptive cycles: Deadlock avoidance for fault-tolerant interconnection networks

Poster Paper [English]
G. Zarza, D. Lugones, D. Franco, E. Luque (2010)

The interconnection network communicates and links together the processing units of modern high-performance computing systems. In this context, network faults have an extremely high impact since most routing algorithms were not designed to tolerate faults. Because of this, just a single fault may stall messages in the network, preventing the finalization of applications, or may lead to deadlocked configurations. In this paper we introduce a scalable deadlock avoidance technique specifically designed to deal with large interconnection networks suffering from a large number of dynamic faults. Our method is based on adding one-slot deadlock avoidance buffers and does not require the use of any virtual channels. Additionally, fully-adaptive routing algorithms may be designed on the basis of our proposal.

PDF

Deadlock avoidance for interconnection networks with multiple dynamic faults

Full Paper [English]
G. Zarza, D. Lugones, D. Franco, E. Luque (2010)

The intensive and continuous use of high-performance computing systems for executing computationally intensive applications, coupled with the large number of elements that make them up, dramatically increases the likelihood of failures during their operation. Clearly, network faults have an extremely high impact because most routing algorithms are not designed to tolerate faults. In such algorithms, just a single fault may lead to deadlocked configurations, thus preventing the correct finalization of applications. This paper introduces a new deadlock avoidance mechanism for routing algorithms designed to deal with multiple dynamic faults. The mechanism is based on adding a small-sized buffer and applying a simple set of actions when accessing output buffers with limited free space. Unlike typical static solutions, this proposal allows the design of routing algorithms capable of treating an unbounded number of […]

PDF

FT-DRB: A Method for Tolerating Dynamic Faults in High-Speed Interconnection Networks

Full Paper [English]
G. Zarza, D. Lugones, D. Franco, E. Luque (2010)

The intensive and continuous use of high-performance computing systems for executing computationally intensive applications, coupled with the large number of elements that make them up, dramatically increases the likelihood of failures during their operation. The interconnection network is a critical part of such systems; therefore, network faults have an extremely high impact because most routing algorithms are not designed to tolerate faults. In such algorithms, just a single fault may stall messages in the network, preventing the finalization of applications, or may lead to deadlocked configurations. This paper introduces a novel fault-tolerant routing method provided with a new deadlock avoidance technique designed to solve an unbounded number of faults appearing at random during system operation. Our method provides escape paths for the stalled messages. In addition, the routing algorithm configures […]

PDF

Fault-tolerant Routing for Multiple Permanent and Non-permanent Faults in HPC Systems

Full Paper [English]
G. Zarza, D. Lugones, D. Franco, E. Luque (2010)

The interconnection network communicates and links together the processing units of modern high-performance computing systems. In this context, network faults have an extremely high impact since most routing algorithms were not designed to tolerate faults. Because of this, just a single fault may stall messages in the network, preventing the finalization of applications, or may lead to deadlocked configurations. In this paper we introduce a fault-tolerant routing method designed to solve a large number of dynamic permanent and non-permanent link faults. As failures appear randomly during system operation, our method provides escape paths for the stalled messages and, at the same time, avoids deadlock occurrences […]

PDF

A Multipath Fault-Tolerant Routing Method for High-Speed Interconnection Networks

Full Paper [English]
G. Zarza, D. Lugones, D. Franco, E. Luque (2009)

The intensive and continuous use of high-performance computers for executing computationally intensive applications, coupled with the large number of elements that make them up, dramatically increases the likelihood of failures during their operation. The interconnection network is a critical part of high-performance computer systems that communicates and links together the processing units. Network faults have an extremely high impact because the occurrence of a single fault may prevent the correct finalization of applications. This work focuses on the problem of fault tolerance for high-speed interconnection networks by designing a fault-tolerant routing method. The goal is to solve a certain number of link and node failures […]

PDF

Políticas de encaminamiento tolerantes a fallos

Master's Thesis [Spanish]
G. Zarza (2008)

The intensive and prolonged use of high-performance computers to run computationally intensive applications, together with the large number of components they comprise, drastically increases the probability of faults occurring during operation. The goal of this work is to address fault tolerance in high-performance interconnection networks through the design of fault-tolerant routing policies. We aim to handle a given number of link and node faults, taking into account their impact factors and probability of occurrence. To do so, we exploit the redundancy of existing communication paths, starting from […]

PDF