6th Workshop on Performance Engineering with Advances in Software and Hardware for Big Data Sciences (PEASH)

In conjunction with 2019 IEEE Conference on Big Data (IEEE BigData 2019)

Dec. 09-12, 2019 @ Los Angeles, CA, USA

Introduction

Following the success of the PEASH (formerly ASH) workshop series co-located with the IEEE Big Data conference over the past five years, we are looking forward to organizing the 6th PEASH workshop in 2019.

The PEASH workshop has positioned itself as a unique forum for bringing the latest developments in hardware and software technology to bear on big data science. The topics of the workshop center on the accessibility and applicability of the latest hardware and software to practical domain problems and educational settings. The workshop will discuss issues in facilitating data-driven discovery with the latest software and hardware technologies for domain researchers, such as performance evaluation, optimization, accessibility, usability, application, and education of new technologies. The presentations and discussions at the workshop will speed and promote the adoption of the latest software and hardware technologies by domain researchers working on big data science.

Data-intensive science has become the fourth paradigm in science and has brought a profound transformation of scientific research. Indeed, data-driven discovery has already happened in various research fields, such as earth sciences, medical sciences, biology, and physics, to name a few. In brief, a vast volume of scientific data captured by new instruments has become publicly accessible for continued and deeper analysis. Big Data analytics leads to the development of many new theories and discoveries but also requires substantial computational resources. However, the mainstream of many domain sciences still relies mostly on traditional experimental paradigms. It is therefore crucial to make the latest advances in software and hardware accessible and usable to domain scientists.

Fueled by the needs of big data analytics, new computing and storage technologies are also developing rapidly, pushing for new high-end hardware geared toward solving big data problems. These hardware advances bring new opportunities for performance improvement but also new challenges. The overall performance bottleneck of a problem can shift when a particular hardware component receives a significant performance boost, requiring a different workload-balancing strategy. While those technologies have the potential to greatly improve capabilities in big data analytics and make significant contributions to data-driven science, it is even more important that data scientists understand and can access them early.

In recent years, analysis algorithms and software for machine learning have boomed. Methods based on deep neural networks have begun to generate interest in nearly every domain field. A dozen open source deep learning frameworks were developed in the last year alone. Comprehensive open source analytic software environments and platforms for data science are also evolving with these new developments. Therefore, how to efficiently use these latest technologies to solve big data problems in scientific domains, and how to facilitate continuing innovation in computer science with them, are the two central focuses of this workshop.

We anticipate workshop participation from computer scientists, domain users, service providers, educators, and technology practitioners in industry. We intend to invite cyber-infrastructure specialists to share their experiences with the latest hardware and software advancements, data scientists to share their experiences and perspectives in using those technologies for data-driven discovery, and educators to share their stories in teaching big data theory, computing foundations, and essential tools and resources.

Research topics of interest include, but are not limited to:

·         Adopting the latest hardware technologies (e.g., multicore, GPGPU, Intel coprocessors, HPC, cloud) for Big Data analytics

·         Using high performance computing resources, cyber-infrastructures, and large systems for accelerating data-to-knowledge discovery

·         Performance analysis, engineering, and evaluation for big data solutions

·         Analysis, visualization, and retrieval of large-scale data sets

·         Applications and use cases of novel tools and resources for Big Data in science and engineering

·         Service-oriented architectures to enable data science

·         Big data and interactive analysis languages (e.g., R, Python, Scala, and Matlab) and cloud-based analytical platforms

·         Demonstrations and evaluations of the latest software tools and hardware technologies

·         Education in data theory, computing foundations, and data infrastructure for data science

·         Applications and methods of machine learning and deep neural networks with big data sets

 

Final Program

PEASH’19 (December 10th)

Workshop Chairs: Hui Zhang, Weijia Xu, Hongfeng Yu

Time | Title | Presenter/Author
8:00am – 8:15am | PEASH’19 Opening Remarks |
8:15am – 8:35am | Parallel R Computing on the Web | Ranjini Subramanian
8:35am – 8:55am | An Evaluation of RDMA-based Message Passing Protocols | Shahram Ghandeharizadeh
8:55am – 9:15am | Parallel Training via Computation Graph Transformation | Fei Wang
9:15am – 9:35am | Accelerating RNN on FPGA with Efficient Conversion of High-Level Designs to RTL | Zongze Li
9:45am – 10:05am | Coffee Break |
10:05am – 10:25am | Parallelized Topological Relaxation Algorithm | Guangchen Ruan
10:25am – 10:45am | Transparent In-memory Cache Management in Apache Spark based on Post-Mortem Analysis | Atsuya Nasu
10:45am – 11:05am | A Fast Exact Viewshed Algorithm on GPU | Faisal Qarah
11:05am – 11:25am | Spatial-Temporal Scientific Data Clustering via Deep Convolutional Neural Network | Jianxin Sun
11:25am – 11:45am | A GPU based parallel algorithm for computing the Sparse Fast Fourier Transform (SFFT) of k-sparse signals | Fahad Saeed
12:10pm – 2:00pm | Lunch Break |
2:20pm – 2:40pm | Plant Event Detection from Time-Varying Point Clouds | Tian Gao
2:40pm – 3:00pm | Parallel Hybrid Metaheuristics with Distributed Intensification and Diversification for Large-scale Optimization in Big Data Statistical Analysis | Wendy Tam
3:00pm – 3:20pm | An "On The Fly" Framework for Efficiently Generating Synthetic Big Data Sets | Karm Mason
3:20pm – 3:40pm | Auto-CNNp: a component-based framework for automating CNN parallelism | Soulaimane GUEDRIA
3:40pm – 4:00pm | Constructing Suffix Array of Next-Generation Sequencing upon In-Memory Lookup Cloud and MapReduce | Meng-Huang Lee
4:00pm – 4:20pm | Coffee Break |

 

Program Chairs

·         Hui Zhang (University of Louisville)

·         Weijia Xu (Texas Advanced Computing Center)

·         Hongfeng Yu (University of Nebraska)

 

 

Program Committee

·         Dan Stanzione (Texas Advanced Computing Center) 

·         Eric Wernert (Pervasive Technology Institute/Indiana University)

·         Nirav Merchant (University of Arizona)

·         J. Ray Scott (Pittsburgh Supercomputing Center)

·         Ian Foster (Argonne National Laboratory)

·         George Ostrouchov (Oak Ridge National Lab/UTK)

·         Jian Li (Huawei Technology Inc.)

·         Avishkar Misra (Oracle Inc.)

·         Dhabaleswar K. Panda (Ohio State University)

·         Chaoli Wang (University of Notre Dame)

·         Robert Hsu (Chung Hua University, Taiwan)

·         Frank Zou (Worcester Polytechnic Institute)

·         Cherry Liu (Georgia Tech)

·         Guangchen Ruan (Research Technology, Indiana University)

·         Tiejun Li (National University of Defense Technology, China)

·         Rui Mao (Shenzhen University, China)

 

 

Paper Submission

Submit your paper to PEASH’19.

1) Papers should be up to 10 pages long and follow the IEEE Computer Society Proceedings Manuscript Formatting Guidelines (see the formatting instructions below).

Formatting Instructions:
8.5" x 11" (DOC, PDF)
LaTeX Formatting Macros

2) Although we accept submissions as PDF, PS, and DOC/RTF files, you are strongly encouraged to generate a PDF version for submission if your paper was prepared in Word.

 

 

Registration

To attend the workshop, you will need to register with the IEEE Big Data 2019 Conference.

 

Hotel Information

Book hotel rooms at the conference group rate here.

Previous ASH Workshops

·         ASH 2018

·         ASH 2017

·         ASH 2016

·         ASH 2015

·         ASH 2014