* Call for papers *
3rd Workshop on Algorithms and Systems for MapReduce and Beyond, July 1, 2016.
Held in conjunction with SIGMOD 2016
San Francisco, USA, June 26th – July 1st, 2016
Author: Ion Stoica, AMPLab, University of California Berkeley
Title: Spark: Past, Present, and Future
Abstract: Almost six years ago we started the Spark project at UC Berkeley.
Spark is a cluster computing engine that is optimized for in-memory
processing, and unifies support for a variety of workloads, including
batch, interactive querying, streaming, and iterative computations. Spark
is now the most active big data project in the open source community, and
is already being used by over one thousand organizations. In this talk,
I’ll take a look back at Spark’s humble beginnings, discuss its current
status, and preview the new and exciting developments that are coming up.
Author: Carlos Guestrin, University of Washington
Title: Big Data, Small Cluster: Choosing “big memory” (RAM, disks, SSDs) over big clusters
The third BeyondMR workshop aims to explore algorithms, computational
models, architectures, languages and interfaces for systems that need
large-scale parallelization and systems designed to support efficient
parallelization and fault tolerance. These include specialized programming
and data-management systems based on MapReduce and extensions, graph
processing systems, and data-intensive workflow and dataflow systems.
We invite submissions on topics such as:
Frameworks for Large-Scale Analytical Processing:
– Models, architectures and languages for data processing pipelines,
data-intensive workflows, DAGs of operations/MapReduce jobs, dataflows,
– Extensions of MapReduce with more fundamental functions other than Map
and Reduce and more complex dataflow connections between function inputs
– Expressing and parallelising iterations, incremental iterations, and
programs consisting of large DAGs of operations.
– Approaches to achieving fault tolerance and to recovering from failures.
Algorithms for Large-Scale Data Processing:
– Methods and techniques for designing efficient algorithms for MapReduce
and similar systems.
– Experiments and experience with new algorithms in these settings.
Cost Models and Optimization Techniques:
– Formal definitions of models that evaluate the efficiency of algorithms
in large-scale parallel processing systems taking into account the
requirements of such systems in different applications.
– Testing and benchmarking of MapReduce extensions and data-intensive
Resource Management for Many-Task Computing:
– Scheduling of tasks and load-balancing techniques.
– Methods to tackle data skewness.
– Study of cases where automatic data distribution in MapReduce and
similar systems does not provide sufficient data balancing.
– Design of algorithms that avoid skewness.
– Extensions of MapReduce that automatically tackle data skewness.
Paper submission deadline: Sun March 5, 2016
Author notification: Sun April 11, 2016
Deadline for camera-ready copy: Sun May 1, 2016
Workshop: Fri July 1, 2016
We invite full research or experience papers (up to 10 pages), or short
papers (up to 4 pages) describing research in progress, formatted using
the ACM double-column style.
The workshop proceedings will be published in ACM DL and the organizers will prepare a SIGMOD Record report.
Foto Afrati (National Technical University of Athens, Greece)
Jan Hidders (TU Delft, The Netherlands)
Christopher Re (Stanford, USA)
Jacek Sroka (University of Warsaw, Poland)
Jeffrey Ullman (Stanford University)
Program Committee (in progress)
– Chris Re, Stanford University (PC chair)
– Foto Afrati, National Technical University of Athens
– Jeffrey Ullman, Stanford University
– Jacek Sroka, University of Warsaw
– Jan Hidders, Delft University of Technology
– Zhengkui Wang, Singapore Institute of Technology
– Khalid Belhajjame, PSL, Université Paris-Dauphine, LAMSADE
– Sourav Bhowmick, Nanyang Technological University
– Graham Cormode, University of Warwick
– Asterios Katsifodimos, Technical University of Berlin
– Paris Koutris, University of Washington
– Dionysios Logothetis, Facebook
– Frank McSherry, ETH Zurich
– Krzysztof Onak, IBM Research
– Mark Santcroos, Rutgers University
– Gautam Shroff, Tata Consultancy Services R&D
– Dan Suciu, University of Washington
– Jianwu Wang, University of Maryland, Baltimore County
– Tim Kraska, Brown University
– Krzysztof Rzadca, University of Warsaw
– Semih Salihoglu, Stanford University
– Ulf Leser, Humboldt-Universität zu Berlin
– Fabio Porto, National Laboratory of Scientific Computation, Brazil
– Eiko Yoneki, University of Cambridge
– Umut Acar, Carnegie Mellon University
– Daniel De Oliveira, Fluminense Federal University
– Tamer Özsu, University of Waterloo
– Anthony Tung, National University of Singapore
– Sergei Vassilvitskii, Google
– Yogesh Simmhan, Indian Institute of Science, Bangalore