HTML observatory

WP 2 – Big Data Platforms & Infrastructure

2.1. Big Data Platforms and Infrastructure

Cloudera CDH
Hortonworks HDP
MapR
Apache Ambari

2.2. Big Data File / Storage Systems

Scalable SQL and NoSQL data stores
HDFS
HBase Administration Cookbook
Solving Big Data Challenges for Enterprise Application Performance Management
SQL vs NoSQL
Apache Cassandra
Apache HBase

2.3. Big Data Batch Processing

Apache Flink
Hadoop
Spark
MapReduce: Simplified Data Processing on Large Clusters
Apache Hadoop MapReduce

2.4. Big Data Stream Processing

Apache Spark Streaming
Kafka Streaming
Apache Storm
Benchmarking Streaming Computation Engines: Storm, Flink and Spark Streaming
Diving into Apache Spark Streaming’s Execution Model. Databricks Engineering Blog.

2.5. Connectors

Kafka Connect
Spark packages

WP 3 – Big Data Management and Processing

Scalable SQL and NoSQL data stores
A Survey on NoSQL Stores
A survey of large-scale analytical query processing in MapReduce
Spatial Partitioning Techniques in Spatial Hadoop
The Era of Big Spatial Data: A Survey. Foundations and Trends in Databases
Big data and its technical challenges

3.1. Big Data Storage and Indexing

MD-HBase: A Scalable Multi-dimensional Data Infrastructure for Location Aware Services
ST-HBase: A Scalable Data Management System for Massive Geo-tagged Objects
R-HBase: A Multi-Dimensional Indexing Framework for Cloud Computing Environment
Pyro: A Spatial-Temporal Big-Data Storage System
ST-hash: An efficient spatiotemporal index for massive trajectory data in a NoSQL database.
Performance Evaluation of SQL and NoSQL Database Management Systems in a Cluster
Spatio-temporal Indexing in Non-relational Distributed Databases
A Comparative Study of Secondary Indexing Techniques in LSM-based NoSQL Databases
Modeling and Indexing Spatiotemporal Trajectory Data in Non-Relational Databases
Data-driven generation of spatio-temporal routines in human mobility

3.2. Big Data Processing

Storm@twitter
Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing
Apache Spark: a unified engine for big data processing
Parallel and Distributed Processing of Reverse Top-k Queries

3.2.1. Spatial and Spatio-temporal Frameworks

Hadoop-GIS: A High Performance Spatial Data Warehousing System over MapReduce
Parallel SECONDO: Practical and Efficient Mobility Data Processing in the Cloud
SpatialHadoop: A MapReduce Framework for Spatial Data
AQWA: Adaptive Query-Workload-Aware Partitioning of Big Spatial Data
Large-Scale Spatial Join Query Processing in Cloud
GeoSpark: A Cluster Computing Framework for Processing Large-Scale Spatial Data
LocationSpark: A Distributed In-Memory Data Management System for Big Spatial Data
Simba: Efficient In-Memory Spatial Analytics
ST-Hadoop: A MapReduce Framework for Spatio-Temporal Data
Efficient spatio-temporal event processing with STARK
Big Spatial Data Processing Frameworks: Feature and Performance Evaluation
Spatio-Temporal Join on Apache Spark
Distributed processing of big mobility data as spatio-temporal data streams
Pigeon: A spatial MapReduce language
Simba: spatial in-memory big data analysis

3.2.2. Trajectory Management Frameworks

UlTraMan: A Unified Platform for Big Trajectory Data Management and Analytics
DITA: Distributed In-Memory Trajectory Analytics
Trajectory Similarity Join in Spatial Networks
TrajSpark: A Scalable and Efficient In-Memory Management System for Big Trajectory Data
A Demonstration of ST-Hadoop: A MapReduce Framework for Big Spatio-temporal Data

3.3. Map Matching

Hidden Markov Map Matching Through Noise and Sparseness
Online map-matching based on Hidden Markov model for real-time traffic sensing applications

WP 4 – Big Data Analytics

4.1. Knowledge Discovery in Big Data

4.1.1. Clustering

Trajectory Clustering via Deep Representation Learning
Clustering Large-Scale Origin-Destination Pairs: A Case Study for Public Transit in Beijing
OPTICS: ordering points to identify the clustering structure
Discovery of collocation episodes in spatiotemporal data
Highly Scalable Sequential Pattern Mining Based on MapReduce Model on the Cloud
A scalable and fast OPTICS for clustering trajectory big data
UlTraMan: A Unified Platform for Big Trajectory Data Management and Analytics
A density-based algorithm for discovering clusters in large spatial databases with noise
A general and parallel platform for mining co-movement patterns over large-scale trajectories
A parallel GPU-based approach for reporting flock patterns
Parallel distributed trajectory pattern mining using MapReduce
On discovering moving clusters in spatio-temporal data
Discovering Evolving Moving Object Groups from Massive-Scale Trajectory Streams
Discovering relative motion patterns in groups of moving point objects
Finding REMO – detecting relative motion patterns in geospatial lifelines
Trajectory clustering: a partition-and-group framework
High performance FPGA and GPU complex pattern matching over spatio-temporal streams
Stream-mode FPGA acceleration of complex pattern trajectory querying
Time-focused clustering of trajectories of moving objects
Segmentation and sampling of moving object trajectories based on representativeness
Scalable parallel OPTICS data clustering using graph algorithmic techniques
Clustering uncertain trajectories
Unsupervised trajectory sampling
On temporal-constrained sub-trajectory cluster analysis
On-line discovery of hot motion paths
GPU parallel algorithms for reporting movement behaviour patterns
Trajectory Data Mining: An Overview
Mobility Data Management and Exploration.

4.1.2. Sequential Pattern Mining

Mining association rules between sets of items in large databases.
Mining sequential patterns
A framework of enroute air traffic conflict detection and resolution through complex network analysis
Mining frequent patterns by pattern-growth: methodology and implications.
Human mobility, social ties, and link prediction
CLAP: Collaborative pattern mining for distributed information systems.

4.1.3. Hot-spot Analysis

Hot-spot Analysis over Big Trajectory Data
A Demonstration of ST-Hadoop: A MapReduce Framework for Big Spatio-temporal Data
ST-Hadoop: A MapReduce Framework for Spatio-Temporal Data.
Data mining for air traffic flow forecasting: a hybrid model of neural network and statistical analysis.
Parallel and Distributed Processing of Spatial Preference Queries using Keywords.
The Era of Big Spatial Data: A Survey.
Scalable Algorithms for Nearest-Neighbor Joins on Big Trajectory Data.
Efficient spatio-temporal event processing with STARK. Proceedings of EDBT.
Local Spatial Autocorrelation Statistics: Distributional Issues and an Application
LocationSpark: A Distributed In-Memory Data Management System for Big Spatial Data.
Spatio-Temporal Join on Apache Spark.
Parallel gathering discovery over big trajectory data
Maritime data integration and analysis: recent progress and research challenges. Proceedings of EDBT.

4.1.4. Future Location Prediction

Identifying Human Mobility via Trajectory Embeddings
An adaptive location prediction model based on fuzzy control
LeZi-update: an information-theoretic approach to track mobile users in PCS networks.
Where will you go? Mobile data mining for next place prediction.
Statistical prediction of aircraft trajectory: regression methods vs point-mass model. Proceedings of ATM.
Extracting mobility statistics from indexed spatio-temporal datasets.
On indexing line segments
Query and update efficient B+-tree based indexing of moving objects. Proceedings of VLDB.
The Use of Landscape Metrics and Transfer Learning to Explore Urban Villages in China.
WhereNext: a location predictor on trajectory pattern mining. Proceedings of ACM SIGKDD.
Prediction and indexing of moving objects with unknown motion patterns.
The TPR*-tree: an optimized spatio-temporal access method for predictive queries.
A quadtree-based dynamic attribute indexing method
MyWay: Location Prediction via mobility profiling.
Mining spatio-temporal association rules, sources, sinks, stationary regions and thoroughfares in object mobility databases.
TrajPattern: Mining sequential patterns from imprecise trajectories of mobile objects.
A data mining approach for location prediction in mobile environments.
Semantic trajectory mining for location prediction.
Predicting object trajectories from high-speed streaming data.
The Design and Analysis of Spatial Data Structures. Addison-Wesley.
Compression of individual sequences via variable-rate coding.

4.1.5. Trajectory Prediction

LSTM-based Flight Trajectory Prediction
Predicting Aircraft Trajectories: A Deep Generative Convolutional Recurrent Neural Networks Approach
Trajectory Length Prediction for Intelligent Traffic Signaling: A Data-Driven Approach
Exploiting AIS Data for Intelligent Maritime Navigation: A Comprehensive Survey From Data to Methodology
Time Series Clustering of Weather Observations in Predicting Climb Phase of Aircraft Trajectories. Proceedings of IWCTS 2016.
Discovering popular routes from trajectories.
Detecting flight trajectory anomalies and predicting diversions in freight transportation. Decision Support Systems 88 (2016)
A methodology for automated trajectory prediction analysis. Proceedings of AIAA GNC 2004.
Online learning for ground trajectory prediction
Predestination: inferring destinations from partial trajectories. Proceedings of UbiComp 2003.
Using neural networks to predict aircraft trajectories.
Stochastic optimal control for aircraft conflict resolution under wind uncertainty
A tutorial on hidden Markov models and selected applications in speech recognition
Common Trajectory Prediction Capability for Decision Support Tools.
An improved trajectory prediction algorithm based on trajectory data mining for air traffic management.
Pattern Recognition (ISBN: 9781597492720)
Adaptive Algorithm to Improve Trajectory Prediction Accuracy of Climbing Aircraft
Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory 13
Big data analytics for time critical mobility forecasting: recent progress and research challenges
Terminal-area aircraft intent inference approach based on online trajectory clustering. The Scientific World Journal, 671360 (2015)
A machine learning approach to trajectory prediction. Proceedings of AIAA GNC 2013.
Aircraft trajectory forecasting using local functional regression in Sobolev space. Transportation research part C: Emerging technologies 39 (2014)
Aircraft Trajectory Prediction Made Easy with Predictive Analytics

4.1.6. Geographical Transfer Learning and Mobility Data

Varying-coefficient models for geospatial transfer learning
Transferring Knowledge of Activity Recognition across Sensor Networks
Dynamic Context-Aware Event Recognition Based on Markov Logic Networks
A Survey on Transfer Learning
Statistical transfer learning: A review and some extensions to statistical process control. Quality Engineering
Parallel Sequential Pattern Mining by Transaction Decomposition
The Use of Landscape Metrics and Transfer Learning to Explore Urban Villages in China
Transfer Knowledge between Cities

4.1.7. Driver Profiling

A Bayesian Network model for contextual versus non-contextual driving behavior assessment
A context aware system for driving style evaluation by an ensemble learning on smartphone sensors data
A framework for evaluating aggressive driving behaviors based on in-vehicle driving records
A smartphone based technique to monitor driving behavior using DTW and crowdsensing
An analysis on older driver’s driving behavior by GPS tracking data: Road selection, left/right turn, and driving speed
An inference engine for smartphones to preprocess data and detect stationary and transportation modes
An online estimation of driving style using data-dependent pointer model
An Overview on Study of Identification of Driver Behavior Characteristics for Automotive Control
Applications of wavelet transform for analysis of freeway traffic: Bottlenecks, transient traffic, and traffic oscillations
Assessing safety critical braking events in naturalistic driving studies
Can vehicle longitudinal jerk be used to identify aggressive drivers? An examination using naturalistic driving data
Combining speed and acceleration to define car users’ safe or unsafe driving behaviour
Crash prediction with behavioral and physiological features for advanced vehicle collision avoidance system
Crash risk analysis during fog conditions using real-time traffic data
Data Collection for Traffic and Drivers’ Behaviour Studies: A Large-scale Survey
Deriving Personal Trip Data from GPS Data: A Literature Review on the Existing Methodologies
Design of Driving Behavior Pattern Measurements Using Smartphone Global Positioning System Data
Detecting trip purposes from smartphone-based travel surveys with artificial neural networks and particle swarm optimization
Development of a method for detecting jerks in safety critical events
Driver behavior profiling: An investigation journal
Driver Behavior Profiling Using Smartphones: A Low-Cost Platform for Driver Monitoring
Driver behavior with a smartphone collision warning application – A field study
Driver behaviour data linked with vehicle, weather, road surface, and daylight data
Driver behaviour profiles for road safety analysis
Driver sleepiness, fatigue, careless behavior and risk of motor vehicle crash and injury: Population based case and control study
Driving analytics using smartphones: Algorithms, comparisons and challenges
Driving in Traffic – Short-Range Sensing for Urban Collision Avoidance
Estimation of driving style in naturalistic highway traffic using maneuver transition probabilities
Evaluating changes in driver behaviour: A risk profiling approach
Evaluation of the predictability of real-time crash risk models
Freeway traffic oscillations: Microscopic analysis of formations and propagations using Wavelet Transform
Hybrid of discrete wavelet transform and adaptive neuro fuzzy inference system for overall driving behavior recognition
Identifying crash-prone traffic conditions under different weather on freeways
Impact of real-time traffic characteristics on crash occurrence: Preliminary results of the case of rare events
Impact of real-time traffic characteristics on freeway crash occurrence: Systematic review and meta-analysis
Impact of traffic oscillations on freeway crash occurrences
Individual driver risk assessment using naturalistic driving data
Inferring transportation modes from GPS trajectories using a convolutional neural network
Influence of driver characteristics on emissions and fuel consumption
Influential factors of red-light running at signalized intersection and prediction using a rare events logistic regression model
Jerky driving. An indicator of accident proneness
On the periodicity of traffic oscillations and capacity drop: The role of driver characteristics
Performance of basic kinematic thresholds in the identification of crash and near-crash events within naturalistic driving data
Profiling drivers’ risky behaviour towards all road users
Real-time crash prediction in an urban expressway using disaggregated data
Risky Driving Behaviours Among Canadian Adults – A Cluster Analysis (Stephanie Hughes, MSc, CHE, April 2016)
Road Accident Driver Behaviour, Learning and Driving Task
Safety analytics for integrating crash frequency and real-time risk modeling for expressways
Short sleep duration, sleep disorders, and traffic accidents
Smartphones as an integrated platform for monitoring driver behaviour: The role of sensor fusion and connectivity
Studying Driving Risk Factors using Multi-Source Mobile Computing Data
Traffic Psychology and Driver Behavior
Transferability and robustness of real-time freeway crash risk assessment
Transportation mode recognition using GPS and accelerometer data
Using Low-cost Smartphone Sensor Data for Locating Crash Risk Spots in a Road Network
Using the Bayesian updating approach to improve the spatial and temporal transferability of real-time crash risk prediction models
Utilizing the eigenvectors of freeway loop data spatiotemporal schematic for real time crash prediction
Vehicle classification from low-frequency GPS data with recurrent neural networks
Vehicle classification using GPS data

4.2. Complex Network Analysis in Big Data

4.2.1. Complex Networks

The role of social networks in information diffusion
A measurement-driven analysis of information propagation in the Flickr social network
Community detection in graphs
Temporal networks
Link prediction in multiplex online social networks
Cascading Behavior in Large Blog Graphs
The link prediction problem for social networks
Link prediction in complex networks: a survey
The structure and function of complex networks
Community Discovery in Dynamic Networks: A Survey
Homophilic network decomposition: a community-centric analysis of online social services
Fighting computer virus attacks
Understanding the spread of malicious mobile-phone programs and their damage potential

4.2.2. Mobility Data Analysis with Networks

ComeTogether: Discovering Communities of Places in Mobility Data
Human mobility and spatial disease dynamics
Inferring the root cause in road traffic anomalies
Understanding individual human mobility patterns
A complex network analysis of human mobility
The purpose of motion: Learning activities from individual mobility networks
A complex network perspective for characterizing urban travel demand patterns: graph theoretical analysis of large-scale origin–destination demand networks
The Structure of Borders in a Small World
Graph-based mobility model for mobile ad hoc network simulation
Ridesourcing Car Detection by Transfer Learning
Complexity in human transportation networks: a comparative analysis of worldwide air transportation and global cargo-ship movements
Exploring Human Mobility Patterns in Urban Scenarios: A Trajectory Data Perspective
Urban-Scale Human Mobility Modeling With Multi-Source Urban Network Data
U-Air: when urban air quality inference meets big data

4.3. Complex Event Recognition in Big Data

4.3.1. Event Pattern Specification Languages

ETALIS: Rule-Based Reasoning in Event Processing
Logic-Based Event Recognition
Complex Event Recognition Languages: Tutorial
An Event Calculus for Event Recognition
Cayuga: A High-Performance Event Processing Engine
TESLA: A Formally Defined Event Specification Language
Processing Flows of Information: From Data Stream to Complex Event Processing
ZStream: A Cost-Based Query Processor for Adaptively Detecting Composite Events
High-Performance Complex Event Processing over Hierarchical Data
Distributed Complex Event Processing with Query Rewriting.

4.3.2. Uncertainty Handling in Complex Event Recognition

Probabilistic Complex Event Recognition: A Survey
Introducing Uncertainty in Complex Event Processing: Model, Implementation, and Validation
Dynamic Context-Aware Event Recognition Based on Markov Logic Networks
Event Queries on Correlated Probabilistic Streams
A Probabilistic Logic Programming Event Calculus
Probabilistic Event Calculus for Event Recognition
Event Modeling and Recognition Using Markov Logic Networks

4.3.3. Complex Event Recognition in Big Data Streams

Chronicle Recognition Improvement Using Temporal Focusing and Hierarchization
On Complexity and Optimization of Expensive Queries in Complex Event Processing
Issues in Complex Event Processing: Status and Prospects in the Big Data Era

4.3.4. Machine Learning for Complex Event Recognition

Plan-based complex event detection across distributed sources
Temporal Abstraction and Inductive Logic Programming for Arrhythmia Recognition from Electrocardiograms
Online Learning of Event Definitions
Sequence Clustering-Based Automated Rule Generation for Adaptive Complex Event Processing
Learning from the Past: Automated Rule Generation for Complex Event Processing
Online Structure Learning for Traffic Management
Automatic Learning of Predictive CEP Rules: Bridging the Gap between Data Mining and Complex Event Processing.
Logical and Relational Learning
Statistical Relational Artificial Intelligence: Logic, Probability, and Computation.
Towards Data Mining Without Information on Knowledge Structure

4.3.5. Complex Event Recognition for Mobility Data

Self-Adaptive Event Recognition for Intelligent Transport Management
Complex Event Processing Applied to Early Maritime Threat Detection
A Special Issue on Intelligent Transportation Systems
Online Structure Learning Using Background Knowledge Axiomatization.
Online Event Recognition from Moving Vessel Trajectories.
A Cooperative Approach to Traffic Congestion Detection With Complex Event Processing and VANET
A Complex Event Processing Approach to Perceive the Vehicular Context.
A Complex Event Processing Approach to Detect Abnormal Behaviours in the Marine Environment
Heterogeneous Stream Processing and Crowdsourcing for Urban Traffic Management