5th Workshop on Advances in Software and Hardware for Big Data Sciences (ASH)

In conjunction with 2018 IEEE Conference on Big Data (IEEE BigData 2018)

Dec. 10-13, 2018 @ Seattle, WA, USA


Following the success of the ASH workshop series, co-located with the IEEE Big Data conference over the past four years, we are looking forward to organizing the 5th ASH workshop in 2018.

The ASH workshop has positioned itself as a unique forum for bringing together the latest technology developments in hardware and software to enable big data science. The topics of the workshop center on the accessibility and applicability of the latest hardware and software to practical domain problems and education settings. The workshop will discuss issues in facilitating data-driven discovery with the latest software and hardware technologies for domain researchers, such as performance evaluation, optimization, accessibility, usability, application, and education of new technologies. The presentations and discussions at the workshop will speed and promote the adoption of the latest software and hardware technologies by domain researchers working on big data science.

Data-intensive science has become the fourth paradigm in science and has brought a profound transformation of scientific research. Indeed, data-driven discovery has already happened in various research fields, such as earth sciences, medical sciences, biology, and physics, to name a few. In brief, a vast volume of scientific data captured by new instruments has become publicly accessible for continued and deeper data analysis. Big Data analytics results in the development of many new theories and discoveries but also requires substantial computational resources in the process. However, the mainstream of many domain sciences still mostly relies on traditional experimental paradigms. It is therefore crucial to make the latest technology advancements in software and hardware accessible and usable to domain scientists.

Fueled by big data analytics needs, new computing and storage technologies are also developing rapidly and pushing for new high-end hardware geared for solving big data problems. These hardware advances bring new opportunities for performance improvement but also new challenges. The overall performance bottleneck of a problem can shift, requiring a different workload-balancing strategy, when a particular hardware component delivers a significant performance boost. While those technologies have the potential to greatly improve capabilities in big data analytics and make significant contributions to data-driven science, it is even more important to make those technologies understood by and accessible to data scientists early.

In recent years, analysis algorithms and software for machine learning have boomed. Deep neural network-based methods have begun to generate buzz in nearly every domain field. A dozen open source deep learning frameworks were developed in the last year alone. Comprehensive open source analytic software environments and platforms for data science are also evolving with these new developments. Therefore, how to efficiently utilize these latest technologies to solve big data problems in scientific domains and how to facilitate continuing innovations in computer science with these latest technologies are two central focuses of this workshop.

We anticipate workshop participation from computer scientists, domain users, service providers, educators, and technology practitioners in industry. We intend to invite cyber-infrastructure specialists to share their experiences with the latest hardware and software advancements, data scientists to share their experiences and perspectives in using those technologies for data-driven discovery, and educators to share their stories in teaching big data theories, computing foundations, and essential tools and resources.

Research topics of interest include, but are not limited to:

·         Adopting the latest hardware technology for Big Data analytics

·         Using high performance computing resources, cyber-infrastructures, and large systems for data-to-knowledge discovery

·         Analysis, visualization, and retrieval of large-scale data sets

·         Applications and use cases of novel tools and resources for Big Data in science and engineering

·         Service-oriented architectures to enable data science

·         Big data and interactive analysis languages (e.g., R, Python, Scala, and Matlab) and cloud-based analytical platforms

·         Demonstrations and evaluations of the latest software tools and hardware technologies

·         Education in data theory, computing foundations, and data infrastructure for data science

·         Applications and methods of machine learning and deep neural networks with big data sets


Important dates

·         Oct. 19, 2018 (extended from Oct. 10, 2018): Due date for full workshop paper submissions

·         Nov. 5, 2018: Notification of paper acceptance to authors

·         Nov. 15, 2018: Camera-ready versions of accepted papers

·         Dec. 10-13, 2018: Workshops


Program chairs

·         Hui Zhang (University of Louisville)

·         Weijia Xu (Texas Advanced Computing Center)

·         Hongfeng Yu (University of Nebraska)



Program Committee

·         Dan Stanzione (Texas Advanced Computing Center) 

·         Eric Wernert (Pervasive Technology Institute/Indiana University)

·         Nirav Merchant (University of Arizona)

·         J. Ray Scott (Pittsburgh Supercomputing Center)

·         Ian Foster (Argonne National Laboratory)

·         George Ostrouchov (Oak Ridge National Lab/UTK)

·         Jian Li (Huawei Technology Inc.)

·         Avishkar Misra (Oracle Inc.)

·         Dhabaleswar K. Panda (Ohio State University)

·         Chaoli Wang (University of Notre Dame)

·         Robert Hsu (Chung Hua University, Taiwan)

·         Frank Zou (Worcester Polytechnic Institute)

·         Cherry Liu (Georgia Tech)

·         Guangchen Ruan (Research Technology, Indiana University)

·         Tiejun Li (National University of Defense Technology, China)

·         Rui Mao (Shenzhen University, China)



Paper Submission

Submit your paper to ASH’18.


Papers should be formatted to 10 pages according to the IEEE Computer Society Proceedings Manuscript Formatting Guidelines (see link to "formatting instructions" below).

Formatting Instructions:
8.5" x 11" (DOC, PDF) 
LaTeX Formatting Macros


Although we accept submissions in the form of PDF, PS, and DOC/RTF files, you are strongly encouraged 
to generate a PDF version for your paper submission if your paper was prepared in Word. 

Final Program

The 5th Workshop on Advances in Software and Hardware for Big Data Sciences

Workshop Chairs: Hui Zhang, Weijia Xu, Hongfeng Yu




2:30pm – 2:55pm

Scalable Record Linkage

Luke Wolcott

2:55pm – 3:20pm

Performance Analysis of Divide-and-Conquer strategies for Large scale Simulations in R
Ranjini Subramanian

3:20pm – 3:45pm

Integrated HPC Scheduler Data Processing Workflow using Apache Zeppelin
Fang Liu

3:45pm – 4:10pm

Untangling Mathematical Knots with Simulated Annealing and Opposition-Based Learning
Juan Lin

4:10pm – 4:30pm

Coffee Break

4:30pm – 4:55pm

3D Reconstruction of Plant Leaves for High-Throughput Phenotyping
Feiyu Zhu


4:55pm – 5:20pm

A Low-Overhead Integrity Verification for Big Data Transfers
Engin Arslan


5:20pm – 5:45pm

Enabling User Driven Big Data Application on Remote Computing Resources

Weijia Xu

5:45pm – 6:10pm

Scaling Collaborative Filtering with PETSc


6:10pm – 6:30pm

Closing Remarks



To attend the workshop, you will need to register with the IEEE Big Data 2018 Conference.


Hotel Information

Book hotel rooms at the conference group rate here.

Previous ASH Workshops

·         ASH 2017

·         ASH 2016

·         ASH 2015

·         ASH 2014