Events

Third Annual SIGHPC-Big Data Meeting (SC18 – Kay Bailey Hutchison Convention Center, Dallas, TX)

Date:   Wed. November 14, 2018 at 12:15 – 1:15 pm, Room D169

Session Chairs: 

Dr. Stratos Efstathiadis, New York University 

Suzanne McIntosh, New York University

Abstract:   The ACM SIGHPC Big Data Virtual Chapter will host its third BoF at SC18. The BoF is open to everyone interested in the convergence of HPC and Big Data. We are pleased to announce the following presentations:

Andras Pataki, Flatiron Institute

Title: “Ceph implementation at Flatiron”

Christopher N. Hill, Department of Earth, Atmosphere, and Planetary Sciences, MIT

Title: “Big Data and Big Models”

Lucas A. Wilson, Dell EMC HPC and AI Engineering

Title: “Getting Smarter Faster: The Intersection of HPC and AI”

Andy Watson, WekaIO

Title: “Storage Challenges for Machine Learning at Scale”


Second Annual SIGHPC-Big Data Meeting (SC17)

Date:   Wed. November 15, 2017 at 12:15 – 1:15 pm

Session Chairs: 

Dr. Stratos Efstathiadis, New York University 

Suzanne McIntosh, New York University

Abstract:   The ACM SIGHPC Big Data Virtual Chapter will host its second BoF at SC17. The BoF is open to everyone interested in the convergence of HPC and Big Data. We are pleased to announce the following presentations:

Curtis Hillegas, Princeton University (slides)

Title: “Successes and Challenges of merging HPC and Big Data at Princeton University”

Dhabaleswar K. (DK) Panda, The Ohio State University

Title: “Big Data Meets HPC: Exploiting HPC Technologies for Accelerating Big Data Processing and Management”

Hatem Ltaief, Extreme Computing Research Center, KAUST (slides)

Title: “HPC and Big Data convergence: Remaining Challenges”

Jeff Denworth, VAST Data

Title: “Breaking the Tradeoffs That Have Separated HPC and Big Data Storage Architectures”


First Annual SIGHPC-Big Data Meeting (SC16)

Date:   Tues. November 15, 2016 at 10:30 am – 12 noon

Session Chairs: 

Dr. Stratos Efstathiadis, New York University 

Suzanne McIntosh, New York University and Cloudera, Inc.

Abstract:   The goal of the BoF is to bring together, for the first time, members and non-members of the SIGHPC Big Data Virtual Chapter who are interested in learning about the challenges of converging Big Data and HPC. The BoF will give attendees the opportunity to hear about existing challenges and to openly discuss solutions, tools, and new approaches for making the best use of available Big Data and HPC resources. We are pleased to announce the following presentations:

Scott Yockel, Harvard Research Computing

Title: “Big Data, where doesn’t it come from, and how I deal with it?”

Harry Mangalam, UC Irvine Research Computing

Title: “BeeGFS in real life” – BeeGFS-SC-2016


Webinar:   Wed. December 2, 2016 at 1pm

Title: “Social Network Analysis on Cray’s Urika to Discover Roles of Information Flow”

Presenter:   

Mike Hinchey is a Solutions Architect for Cray’s Analytics Products Group, where he works with customers to explore their data and demonstrate the possibilities of various analytic technologies on big data. He specializes in the use of graph algorithms, SPARQL, SQL and Spark.

Abstract:   

With the advent of social media, and with more people sharing more personal information online, it is now possible to gain insight into social activities that until recently were very hard to observe. Social Network Analytics is a powerful technique that allows researchers to bring together data from a multitude of online sources and to use this data to discover hidden relationships. In particular, it is now possible to discover communities of users with an interest in a specific topic, or to identify users according to the role they may play within their social hierarchy, e.g., originators, key influencers, rebroadcasters, connectors, etc.

At Cray, we have a long history of doing relationship analytics using very specialized tools and platforms. In this seminar, we intend to discuss and demonstrate the use of new-generation analytic techniques based on Hadoop and Spark on Twitter data to find communities of users that discuss certain topics, such as sports or consumer electronics, and to identify key users who play a role in those communities.