Current Projects:
SpamX: the next-generation system for efficient online scalable social spam detection
Several open problems and challenges need to be carefully considered and solved: (1) How can we handle the enormous large-scale online social data streams coming from many sites? Our previous work proposed one feasible solution, but at the system level it still requires more sophisticated design and implementation. (2) How can we guarantee that social spam is detected in a timely and effective manner? Our previous work tried to minimize processing latency, but in the face of rapidly changing malicious activities many open problems remain, such as instantly detecting spam across various topics or keywords and seamlessly handling data from different sites and platforms. (3) How can we effectively exploit the nature and inner relationships of spam during online detection? To ensure effective detection, using the attributes of spam efficiently at runtime is a challenging and interesting problem. Exploring a new generation of systems for processing large-scale online social data streams is a promising topic that requires further research.
SR3: Customizable Recovery for Stateful Stream Processing Systems
Modern stream processing applications need to store and update state alongside their processing, and must handle live data streams from massive, geo-distributed data sets in a timely fashion. Because they run in dynamic distributed environments where workloads may change in unexpected ways, multiple stream operators can fail at the same time, causing severe state loss. However, state-of-the-art stream processing systems are mainly designed for low-latency intra-datacenter settings and do not scale well for applications with large distributed states: they suffer from a centralized bottleneck and high state-recovery latency, and are either slow, resource-expensive, or unable to handle multiple simultaneous failures. In this work, we design SR3, a customizable state recovery framework that provides fast and scalable state recovery mechanisms for protecting large distributed states in stream processing systems. SR3 offers three recovery mechanisms --- the star-structured recovery, the line-structured recovery, and the tree-structured recovery --- to cater to the needs of different stream processing computation models, state sizes, and network settings. Our design adopts a decentralized architecture that partitions and replicates states using consistent ring overlays that leverage distributed hash tables (DHTs). We show that this approach can significantly improve the scalability and flexibility of state recovery.
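The decentralized placement idea above can be illustrated with a small consistent-hashing sketch: each operator's state partition hashes to a position on a ring, is owned by the first node clockwise, and is replicated to the next successors, so surviving replicas can restore a failed node's state. This is a generic DHT placement scheme under assumed names (`ConsistentRing`, `owners`, the node labels), not SR3's actual implementation.

```python
import hashlib
from bisect import bisect_right

def ring_hash(key):
    # Map any string key to a position on the hash ring.
    return int(hashlib.sha1(key.encode()).hexdigest(), 16)

class ConsistentRing:
    """Minimal consistent-hashing ring: a state partition is owned by the
    first node clockwise from its hash and replicated to the next
    `replicas` distinct successors (illustrative sketch only)."""
    def __init__(self, nodes, replicas=2):
        self.replicas = replicas
        self.ring = sorted((ring_hash(n), n) for n in nodes)

    def owners(self, partition_key):
        # Primary owner plus `replicas` distinct successor nodes.
        hashes = [h for h, _ in self.ring]
        pos = bisect_right(hashes, ring_hash(partition_key))
        result = []
        for i in range(len(self.ring)):
            node = self.ring[(pos + i) % len(self.ring)][1]
            if node not in result:
                result.append(node)
            if len(result) == self.replicas + 1:
                break
        return result

ring = ConsistentRing(["node-a", "node-b", "node-c", "node-d"], replicas=2)
placement = ring.owners("operator-7/state-partition-3")
# placement[0] is the primary; if it fails, either surviving
# replica in placement[1:] can restore the lost partition.
```

The appeal of this layout is that adding or removing a node only remaps the partitions adjacent to it on the ring, which keeps recovery traffic localized.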
SpamHunter: detecting online social networks spam at scale
The huge amount of social spam generated by large-scale social networks has become a common phenomenon in the contemporary world. The majority of prior research focused on improving the efficiency of identifying social spam from limited-size data on the algorithm side; few works exploit the data correlations among large-scale distributed social spam or the opportunities on the system side. In this paper, we propose a new scalable system, named SpamHunter, which utilizes spam correlations across distributed data sources to enhance the performance of large-scale social spam detection. It identifies correlated social spam from various distributed servers/sources through DHT-based hierarchical functional trees. These functional trees act as bridges among data servers/sources to aggregate, exchange, and communicate updated and newly emerging social spam with each other. Furthermore, by processing online social logs instantly, SpamHunter allows streaming data to be processed in a distributed manner, which reduces online detection latency and avoids the inefficiency of outdated spam posts.
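The aggregate-and-disseminate role of the functional trees can be sketched as a two-pass tree traversal: leaves (data servers) report locally detected spam signatures upward, and the merged set is pushed back down so every source learns newly emerging spam. The class and field names below are hypothetical stand-ins, not SpamHunter's actual interfaces.

```python
class FunctionalTreeNode:
    """Illustrative node in a hierarchical aggregation tree.
    Each data server holds a local set of spam signatures; the tree
    merges them bottom-up and redistributes the result top-down."""
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.local_spam = set()

    def aggregate(self):
        # Bottom-up pass: merge every subtree's signatures into one set.
        merged = set(self.local_spam)
        for child in self.children:
            merged |= child.aggregate()
        return merged

    def disseminate(self, merged):
        # Top-down pass: push the global signature set to all sources.
        self.local_spam = set(merged)
        for child in self.children:
            child.disseminate(merged)

leaf1 = FunctionalTreeNode("server-1"); leaf1.local_spam = {"spam-url-A"}
leaf2 = FunctionalTreeNode("server-2"); leaf2.local_spam = {"spam-url-B"}
root = FunctionalTreeNode("root", [leaf1, leaf2])

global_spam = root.aggregate()
root.disseminate(global_spam)
# After the two passes, each server also knows the spam that
# first appeared at the other server.
```

In the real system the tree sits on a DHT overlay, so aggregation cost grows with tree depth rather than with the total number of data servers.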
Machine learning guided unified memory optimization in GPUs
NVIDIA’s unified memory (UM) creates a pool of managed memory on top of physically separated CPU and GPU memories. UM automatically migrates page-level data on demand, so programmers can quickly write CUDA code for heterogeneous machines without tedious and error-prone manual memory management. To improve performance, NVIDIA allows advanced programmers to pass additional memory-use hints to the UM driver. However, it is extremely difficult for programmers to decide when and how to use unified memory efficiently, given the complex interactions between applications and hardware. In this project, we present a machine learning-based approach to choosing between discrete memory and unified memory, with additional consideration of different memory hints. Our approach uses profiler-generated metrics of CUDA programs to train a model offline, which is later used to guide optimal use of UM for multiple applications at runtime.
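The offline-trained-model idea can be sketched with a toy nearest-neighbor classifier over profiler metrics: at runtime, an application's metric vector is matched against labeled training examples to pick a memory configuration. The feature names, thresholds, training data, and policy labels here are all invented for illustration; they are not the project's actual feature set or model.

```python
def nearest_policy(metrics, training_set):
    """Return the memory policy of the closest training example,
    by squared Euclidean distance over normalized profiler metrics.
    A toy stand-in for the trained model used in the project."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(training_set, key=lambda ex: dist(metrics, ex[0]))[1]

# Hypothetical training examples:
# (page-fault rate, data-reuse score, working-set / GPU-memory ratio)
# -> best-performing memory configuration observed offline.
training = [
    ((0.9, 0.1, 1.5), "um_prefetch"),   # oversubscribed, streaming access
    ((0.1, 0.8, 0.4), "discrete"),      # small working set, heavy reuse
    ((0.5, 0.5, 1.0), "um_default"),
]

# A new application profile close to the "oversubscribed, streaming"
# regime should be steered toward UM with prefetch hints.
choice = nearest_policy((0.85, 0.2, 1.4), training)
```

A real deployment would use a proper learned model and far more examples, but the runtime flow is the same: profile once, look up the predicted configuration, then launch with the corresponding allocation strategy and hints.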