Data science applications to connected vehicles
Key Notes
- Data generated by sensors and actuators in connected vehicles are noisy, anomalous, redundant, rapidly changing, correlated and heterogeneous.
Main findings
- Multitude of formats and data types
- Data in connected vehicles are generated and collected at high speed
Applications
Mobility
- Understand patterns and trends in mobility data
- Predicting traffic flow
- Provide shortest or alternative routes
Safety
- Driver behaviour and performance analysis
- Infer real-time environmental conditions
- Lane-changing assistance
- Understand interactions between drivers and pedestrians at signalized intersections
Support
- Guidance to parking spaces
- Driver behaviour analysis (e.g., in the insurance domain, for calculating a safety score for the driver: pay-how-you-drive instead of insurance premiums based on population groups)
- Vehicle predictive maintenance.
Connected vehicle data
- 560 GB/day
- Data generated in CVs exhibit either temporal correlation, spatial correlation or both
- A stream is a sequence of data elements ordered by time
- Discrete signals, event logs, or any combination of time series data
- Drift is mostly associated with gradual changes in the target concept
- Sensory data stream
- Spatial, temporal, and spatio-temporal attributes
- Existence of missing data (absent readings).
- Real-time data cleaning
- Knowledge discovery from data streams
- Data windows are a way of looking at relevant slices of a data stream.
- Windowing models: landmark, tilted, sliding and damped windows
Landmark Window Model. This model identifies relevant points (the landmarks) in the data stream, and the aggregate operator uses all records seen since the latest landmark.
Damped Window Model. This model assigns greater weight to more recently arrived transactions.
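To make the difference between the window models concrete, here is a minimal sketch of three of them (landmark, sliding and damped) computing a running mean over a stream of speed readings. The class names, the window size and the decay factor are assumptions chosen for illustration and do not come from any particular library; the tilted window is left out for brevity.

```python
from collections import deque


class LandmarkMean:
    """Mean of all readings seen since the most recent landmark."""

    def __init__(self) -> None:
        self.count = 0
        self.total = 0.0

    def reset(self) -> None:
        # Call this when a new landmark (e.g., the start of a trip) is observed.
        self.count = 0
        self.total = 0.0

    def update(self, x: float) -> float:
        self.count += 1
        self.total += x
        return self.total / self.count


class SlidingMean:
    """Mean of the last `size` readings only; older readings are dropped."""

    def __init__(self, size: int) -> None:
        self.buf = deque(maxlen=size)

    def update(self, x: float) -> float:
        self.buf.append(x)
        return sum(self.buf) / len(self.buf)


class DampedMean:
    """Weighted mean in which a reading that is k steps old has weight decay**k."""

    def __init__(self, decay: float) -> None:
        self.decay = decay
        self.weighted_sum = 0.0
        self.weight = 0.0

    def update(self, x: float) -> float:
        # Fade everything seen so far, then add the new reading with weight 1.
        self.weighted_sum = self.decay * self.weighted_sum + x
        self.weight = self.decay * self.weight + 1.0
        return self.weighted_sum / self.weight


speeds = [50.0, 52.0, 51.0, 90.0, 88.0]  # hypothetical speed readings
landmark, sliding, damped = LandmarkMean(), SlidingMean(size=2), DampedMean(decay=0.5)
for s in speeds:
    lm, sm, dm = landmark.update(s), sliding.update(s), damped.update(s)
```

After the jump in speed, the landmark mean still carries the whole history, the sliding mean reflects only the last two readings, and the damped mean sits in between because old readings fade instead of being cut off.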
Data pre-processing
- Noise filtration
- Outlier detection
- Anomaly detection
- Feature extraction
- Sparsity handling
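Two of these pre-processing steps can be sketched on a stream with only a small buffer of recent readings. The example below uses a sliding median for noise filtration and a rolling z-score for outlier detection; both are common generic choices picked here for illustration, and the function names, window sizes and threshold are assumptions rather than anything prescribed by the notes above.

```python
from collections import deque
from statistics import mean, median, pstdev


def median_filter(stream, size=3):
    """Yield the median of the last `size` readings (simple noise filtration)."""
    window = deque(maxlen=size)
    for x in stream:
        window.append(x)
        yield median(window)


def zscore_outliers(stream, size=5, threshold=3.0):
    """Yield (value, is_outlier), comparing each reading with the previous `size` accepted readings."""
    history = deque(maxlen=size)
    for x in stream:
        flagged = False
        if len(history) == size:
            mu, sigma = mean(history), pstdev(history)
            flagged = sigma > 0 and abs(x - mu) / sigma > threshold
        yield x, flagged
        if not flagged:
            history.append(x)  # keep flagged readings out of the baseline


readings = [20.1, 20.3, 19.9, 20.2, 20.0, 85.0, 20.1, 20.4]  # hypothetical sensor values
smoothed = list(median_filter(readings))
flags = [is_outlier for _, is_outlier in zscore_outliers(readings)]
```

Both generators touch each reading once and keep a bounded buffer, which is what makes them usable for real-time cleaning on a device or at the edge.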
Knowledge management.
- On-device
- On-edge
- Remote
Algorithms for clustering data streams
- Stream and CluStream algorithms
State-of-the-art on clustering data streams
- CluStream [1], DenStream [2], StreamKM++ [3], or ClusTree
- DenStream [2] is an extension of the DBSCAN algorithm
- StreamKM++ [3] is an extension of k-means++, and StrAP [4] of Affinity Propagation (AP)
In a record-at-a-time processing model, long-running stateful operators process records as they arrive, update their internal state, and send out new records.
The micro-batching processing model runs each streaming computation as a series of deterministic batch computations on small time intervals.
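The two processing models can be contrasted with a small sketch that computes mean speed per vehicle in both ways. The record layout, the function names and the one-second interval are assumptions made for this illustration; real stream processors add fault tolerance, parallelism and out-of-order handling that are not shown here.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Record = Tuple[float, str, float]  # (timestamp in seconds, vehicle id, speed)


def record_at_a_time(records: Iterable[Record]) -> Iterator[Tuple[str, float]]:
    """Stateful operator: update per-vehicle state on every record and emit a result at once."""
    state: Dict[str, Tuple[int, float]] = {}  # vehicle id -> (count, running sum)
    for _, vehicle, speed in records:
        count, total = state.get(vehicle, (0, 0.0))
        state[vehicle] = (count + 1, total + speed)
        yield vehicle, (total + speed) / (count + 1)


def micro_batches(records: Iterable[Record], interval: float) -> Iterator[List[Record]]:
    """Group time-ordered records into consecutive batches covering `interval` seconds each."""
    batch: List[Record] = []
    current = None
    for rec in records:
        index = int(rec[0] // interval)
        if current is not None and index != current:
            yield batch
            batch = []
        current = index
        batch.append(rec)
    if batch:
        yield batch


def batch_mean_speed(batch: List[Record]) -> Dict[str, float]:
    """Deterministic batch computation: mean speed per vehicle within one batch."""
    sums: Dict[str, Tuple[int, float]] = {}
    for _, vehicle, speed in batch:
        count, total = sums.get(vehicle, (0, 0.0))
        sums[vehicle] = (count + 1, total + speed)
    return {v: total / count for v, (count, total) in sums.items()}


records = [(0.2, "a", 50.0), (0.7, "b", 30.0), (1.1, "a", 54.0),
           (1.8, "a", 58.0), (2.5, "b", 34.0)]  # hypothetical readings
per_record = list(record_at_a_time(records))
per_batch = [batch_mean_speed(b) for b in micro_batches(records, interval=1.0)]
```

The record-at-a-time version emits one result per input with state carried across the whole stream, while the micro-batch version emits one result per interval and each batch can be recomputed independently of the others.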
CluStream - The idea behind the CluStream [1] method is to divide the clustering process into an online component, which periodically stores detailed summary statistics, and an offline component, which uses only these summary statistics.
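The kind of summary statistics the online component keeps can be sketched as a micro-cluster: a count plus linear and squared sums of the points and their timestamps. The class below is a simplified illustration under that assumption, with hypothetical names; it omits the rest of the method (assignment of points to the nearest micro-cluster, deletion of stale ones, and periodic snapshots).

```python
import math
from typing import List, Sequence


class MicroCluster:
    """Additive summary of the points absorbed so far."""

    def __init__(self, dim: int) -> None:
        self.n = 0
        self.ls: List[float] = [0.0] * dim  # per-dimension linear sum
        self.ss: List[float] = [0.0] * dim  # per-dimension sum of squares
        self.lt = 0.0                       # sum of timestamps
        self.st = 0.0                       # sum of squared timestamps

    def insert(self, point: Sequence[float], timestamp: float) -> None:
        self.n += 1
        for i, x in enumerate(point):
            self.ls[i] += x
            self.ss[i] += x * x
        self.lt += timestamp
        self.st += timestamp * timestamp

    def merge(self, other: "MicroCluster") -> None:
        # Every field is a sum, so two summaries combine by plain addition.
        self.n += other.n
        self.ls = [a + b for a, b in zip(self.ls, other.ls)]
        self.ss = [a + b for a, b in zip(self.ss, other.ss)]
        self.lt += other.lt
        self.st += other.st

    def centroid(self) -> List[float]:
        return [s / self.n for s in self.ls]

    def radius(self) -> float:
        # Root-mean-square deviation of the absorbed points from the centroid.
        var = sum(ss / self.n - (ls / self.n) ** 2 for ls, ss in zip(self.ls, self.ss))
        return math.sqrt(max(var, 0.0))


mc = MicroCluster(dim=2)
mc.insert((1.0, 2.0), timestamp=1.0)
mc.insert((3.0, 4.0), timestamp=2.0)
other = MicroCluster(dim=2)
other.insert((5.0, 6.0), timestamp=3.0)
```

Because the centroid and radius can be derived from the sums alone, an offline step can cluster the micro-clusters (for example, with a weighted k-means over their centroids) without revisiting the raw stream.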
StreamKM++ [3] is a two-phase (online-offline) algorithm that maintains a small coreset (a compact outline) of the input data using the merge-and-reduce technique.
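The merge-and-reduce bookkeeping can be sketched independently of how the reduction itself is done. In the example below, the `reduce` function is a stand-in that uses weighted random sampling, not the reduction StreamKM++ actually performs; the class name, the bucket size `m` and the 1-D points are likewise assumptions made to keep the sketch short.

```python
import random
from typing import List, Optional, Tuple

WeightedPoint = Tuple[float, float]  # (value, weight); 1-D points for brevity


def reduce(points: List[WeightedPoint], m: int, rng: random.Random) -> List[WeightedPoint]:
    """Stand-in reduction: weighted sample of m points that preserves the total weight."""
    total = sum(w for _, w in points)
    chosen = rng.choices(points, weights=[w for _, w in points], k=m)
    return [(x, total / m) for x, _ in chosen]


class MergeAndReduce:
    """Keeps at most one summary of m points per level, like digits of a binary counter."""

    def __init__(self, m: int, seed: int = 0) -> None:
        self.m = m
        self.rng = random.Random(seed)
        self.buffer: List[WeightedPoint] = []                  # raw points, weight 1 each
        self.buckets: List[Optional[List[WeightedPoint]]] = []  # level i stands for m * 2**i points

    def insert(self, x: float) -> None:
        self.buffer.append((x, 1.0))
        if len(self.buffer) < self.m:
            return
        carry, self.buffer = self.buffer, []
        for i in range(len(self.buckets)):
            if self.buckets[i] is None:
                self.buckets[i] = carry  # free slot: store and stop
                return
            # Occupied slot: merge both summaries, reduce to m points, carry upward.
            carry = reduce(self.buckets[i] + carry, self.m, self.rng)
            self.buckets[i] = None
        self.buckets.append(carry)

    def summary(self) -> List[WeightedPoint]:
        points = list(self.buffer)
        for bucket in self.buckets:
            if bucket is not None:
                points.extend(bucket)
        return points


mr = MergeAndReduce(m=10)
for value in range(1000):
    mr.insert(float(value))
summary = mr.summary()
total_weight = sum(w for _, w in summary)
```

The number of stored points grows only logarithmically with the length of the stream, while the total weight of the summary still equals the number of points seen.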
StrAP [4] is an extension of the Affinity Propagation (AP) [44] algorithm for data streams, which uses a reservoir for saving potential outliers
DenStream [2] is a density-based data stream clustering algorithm that also uses a feature vector based on the CF vector.
SOStream [50] is a density-based clustering algorithm inspired by both the principle of the DBSCAN algorithm and self-organizing maps (SOM)
More Reads
- A Clustering-based Framework for Classifying Data Streams
- Memory Efficient Experience Replay for Streaming Learning
- Data Stream Clustering: Challenges and Issues
- Event-Driven News Stream Clustering using Entity-Aware Contextual Embeddings
- A benchmark of data stream classification for human activity recognition on connected objects
- Online Machine Learning in Big Data Streams
- Contextual One-Class Classification in Data Streams
- Data Stream Clustering: A Review
- A Survey of Autonomous Driving: Common Practices and Emerging Technologies
- Machine Learning for Internet of Things Data Analysis: A Survey
- Data Stream Clustering: A Survey
- A Survey Paper on Data Stream Mining
Keep Thinking!!!