Okapi is an open-source library of graph analytics and machine learning algorithms for the Giraph graph processing system, and it is part of the Grafos.ML project. Currently, it contains algorithms for collaborative filtering and graph mining. Our plan is to build a community around the project and enrich it with more toolkits. For more details, check out a nice post about Okapi by Claudio Martella, one of the project's contributors.
We’ve recently launched Grafos.ML, our new project on graph mining and machine learning. The goal of the project is to develop tools for large-scale graph mining and ML analytics. Our first effort is Okapi, a library of graph mining and machine learning algorithms developed for the Giraph graph processing system. Check out the site for more information.
Our paper on debugging systems for data-intensive analytics got accepted to the ACM Symposium on Cloud Computing. The paper presents Newt, a scalable architecture for capturing and querying data lineage information, to find and resolve errors in processing pipelines.
Newt provides a flexible instrumentation API that allows system developers to collect fine-grained lineage from a range of data-intensive scalable computing (DISC) architectures. Newt pairs this API with a scale-out, fault-tolerant lineage store and query engine.
Until the camera-ready version is available, take a look at the technical report here.
Consider submitting your work to the 1st Workshop on Large-Scale Recommender Systems (LSRS), which will be co-located with RecSys’13.
Consider submitting your work to this year’s IEEE International Conference on Peer-to-Peer Computing.