Data Science, Database, AI Startups and Domain Learning's (Video-Image-Text-Data-Database): Research paper reads connected cars

August 05, 2021

Research paper reads connected cars

Data science applications to connected vehicles

Key Notes

Data generated by sensors and actuators in connected vehicles is noisy, anomalous, redundant, rapidly changing, correlated and heterogeneous.

Main findings

Multitude of formats and data types
Data in connected vehicles is generated and collected at high speed

Applications

Mobility
Understand patterns and trends in mobility data
Predicting traffic flow
Provide shortest or alternative routes

Safety

Driver behaviour and performance analysis
Infer real-time environmental conditions
Lane-changing assistance
Understand interactions between drivers and pedestrians at signalized intersections

Support

Guidance to parking spaces
Driver behaviour analysis (e.g., in the insurance domain, calculating a safety score per driver so that premiums follow a pay-how-you-drive model instead of being based on population groups)
Vehicle predictive maintenance.

Connected vehicle data

560 GB/day
Data generated in connected vehicles (CVs) exhibits temporal correlation, spatial correlation, or both
A stream is a sequence of data elements ordered by time
A stream may consist of discrete signals, event logs, or any combination of time series data

Drift is more commonly associated with gradual changes in the target concept

Sensory data stream
Spatial, temporal, and spatio-temporal attributes
Existence of missing data (absent readings).
Real-time data cleaning
Knowledge discovery from data streams
Data windows are a way of looking at relevant slices of a data stream.
Windowing models: landmark, tilted, sliding and damped windows

Anomaly detection with these properties: for every data type we may need to look at how the data behaves over time, e.g. recurring patterns, gradual increases, sudden increases, lows and highs.

In the stock market this is done with candlestick patterns: analysts look for patterns over periods of 3 or 6 months and check whether anything stands out. What counts as an anomaly depends on the use case, but the properties of the data (sudden, incremental, gradual, recurring) help spot anomalies by comparing historical and current observations.
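As a rough illustration of comparing current observations against recent history, here is a minimal sketch that flags sudden changes with a rolling z-score. This is my own example and not a method from the paper; the function name, the `history` and `threshold` parameters, and the sample speeds are all assumptions.

```python
from collections import deque
from statistics import mean, stdev


def flag_sudden_changes(readings, history=5, threshold=3.0):
    """Return indices of readings that deviate from the previous `history`
    readings by more than `threshold` standard deviations."""
    window = deque(maxlen=history)  # the "historical" observations
    flagged = []
    for i, x in enumerate(readings):
        if len(window) == history:
            mu, sigma = mean(window), stdev(window)
            # Skip flat histories (sigma == 0) to avoid flagging everything.
            if sigma > 0 and abs(x - mu) > threshold * sigma:
                flagged.append(i)
        window.append(x)
    return flagged


# Hypothetical speed readings with one sudden spike at index 7.
speeds = [50, 51, 49, 50, 52, 51, 50, 95, 51, 50]
print(flag_sudden_changes(speeds))  # [7]
```

A sudden spike is the easy case. Gradual or recurring changes would need a longer history or a seasonal baseline, so the right rule depends on the use case.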

Sliding Window. Given a window with width w and current time point t, the interest is in the frequent patterns occurring in the window [t − w + 1, t].
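A small sketch of the sliding window idea, assuming one record arrives per time step so that a width of w records corresponds to the interval [t - w + 1, t]. The class name and the event labels are made up for illustration.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Keeps item counts for the last `width` records only."""

    def __init__(self, width):
        self.width = width
        self.window = deque()
        self.counts = Counter()

    def add(self, item):
        self.window.append(item)
        self.counts[item] += 1
        if len(self.window) > self.width:
            # The oldest record falls out of the window.
            old = self.window.popleft()
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]

    def frequent(self, min_count):
        """Items seen at least `min_count` times inside the window."""
        return {item for item, c in self.counts.items() if c >= min_count}


counter = SlidingWindowCounter(width=4)
for event in ["brake", "accel", "brake", "turn", "brake", "accel", "accel"]:
    counter.add(event)
print(counter.frequent(min_count=2))  # {'accel'}
```

Note that "brake" occurs three times in the whole stream but only once in the last four records, so it is no longer frequent.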

Landmark Window identifies relevant points (the landmark) in the data stream and the aggregate operator uses all records seen so far after the landmark.
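A minimal sketch of a landmark window, assuming the landmark is a hypothetical "trip_start" event and the aggregate is a running mean. Both choices are mine, purely for illustration.

```python
class LandmarkMean:
    """Running mean over all values seen since the most recent landmark."""

    def __init__(self):
        self.count = 0
        self.total = 0.0

    def reset(self):
        # Called when a landmark is observed: forget everything before it.
        self.count = 0
        self.total = 0.0

    def add(self, value):
        self.count += 1
        self.total += value

    @property
    def mean(self):
        return self.total / self.count if self.count else None


# Hypothetical stream of (event, value) records.
stream = [
    ("reading", 10.0),
    ("reading", 20.0),
    ("trip_start", None),  # the landmark
    ("reading", 30.0),
    ("reading", 50.0),
]

agg = LandmarkMean()
for event, value in stream:
    if event == "trip_start":
        agg.reset()
    else:
        agg.add(value)
print(agg.mean)  # 40.0
```

Unlike the sliding window, nothing expires here until the next landmark arrives, so the window keeps growing.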

Damped Window Model. This model assigns greater weight to more recently arrived transactions.
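One common way to realise a damped window is exponential decay: every existing weight is multiplied by a factor below 1 each time a new record arrives. The sketch below assumes that scheme; the class name and the decay value of 0.5 are illustrative choices.

```python
class DampedCounter:
    """Item weights that fade exponentially as newer records arrive."""

    def __init__(self, decay=0.5):
        self.decay = decay
        self.weights = {}

    def add(self, item):
        # Age everything seen so far, then give the new record full weight.
        for key in self.weights:
            self.weights[key] *= self.decay
        self.weights[item] = self.weights.get(item, 0.0) + 1.0


damped = DampedCounter(decay=0.5)
for item in ["a", "a", "b"]:
    damped.add(item)
print(damped.weights)  # {'a': 0.75, 'b': 1.0}
```

Here "a" was seen twice and "b" only once, yet "b" ends up with the larger weight because it arrived most recently.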

Data pre-processing

Noise filtration
Outliers detection
Anomaly detection
Feature extraction
Sparsity handling

Knowledge management.

On-device
On-edge
Remote

Algorithms for clustering data streams

Stream and CluStream algorithms

State-of-the-art on clustering data streams

CluStream [1], DenStream [2], StreamKM++ [3], or ClusTree
DenStream [2] is an extension of the DBSCAN algorithm
StreamKM++ [3] is an extension of k-means++, and StrAP [4] of Affinity Propagation (AP)

In a record-at-a-time processing model, long-running stateful operators process records as they arrive, update their internal state, and send out new records.
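A toy sketch of a record-at-a-time stateful operator, written as a Python generator. The operator, its per-vehicle state, and the sample records are hypothetical and not tied to any particular streaming engine.

```python
def running_max_speed(records):
    """Stateful operator: for each (vehicle_id, speed) record, update the
    per-vehicle maximum and emit a new record whenever it increases."""
    state = {}  # vehicle_id -> highest speed seen so far
    for vehicle_id, speed in records:
        if speed > state.get(vehicle_id, float("-inf")):
            state[vehicle_id] = speed
            yield vehicle_id, speed


records = [("car1", 40), ("car2", 55), ("car1", 38), ("car1", 62)]
emitted = list(running_max_speed(records))
print(emitted)  # [('car1', 40), ('car2', 55), ('car1', 62)]
```

Each record is handled the moment it arrives, and the operator keeps its state between records for as long as it runs.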

The micro-batching processing model runs each streaming computation as a series of deterministic batch computations on small time intervals.
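A minimal sketch of micro-batching, assuming records are (timestamp, value) pairs that arrive in timestamp order. The function names and the one-second interval are assumptions for illustration.

```python
from itertools import groupby


def micro_batches(records, interval):
    """Group time-ordered (timestamp, value) records into consecutive
    batches of `interval` seconds and yield (batch_id, values)."""
    for batch_id, group in groupby(records, key=lambda r: int(r[0] // interval)):
        yield batch_id, [value for _, value in group]


def batch_mean(values):
    # A deterministic batch computation applied to each small batch.
    return sum(values) / len(values)


records = [(0.2, 10), (0.9, 20), (1.1, 30), (2.5, 40), (2.7, 60)]
results = [(bid, batch_mean(vals)) for bid, vals in micro_batches(records, 1.0)]
print(results)  # [(0, 15.0), (1, 30.0), (2, 50.0)]
```

Because each batch is computed as a pure function of its records, a failed batch can simply be recomputed, at the cost of waiting for the interval to close.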

CluStream - The idea behind the CluStream [1] method is to divide the clustering process into an online component, which periodically stores detailed summary statistics, and an offline component, which uses only these summary statistics.
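To make the online/offline split concrete, here is a simplified sketch of the online side: each micro-cluster keeps only a count, a linear sum and a squared sum. The fixed `max_dist` threshold is my simplification, and the temporal statistics and stored snapshots of the real algorithm are left out, so treat this as an illustration rather than CluStream itself.

```python
import math


class MicroCluster:
    """Summary statistics for a group of points: count, linear sum,
    and squared sum per dimension."""

    def __init__(self, point):
        self.n = 1
        self.ls = list(point)
        self.ss = [x * x for x in point]

    def absorb(self, point):
        self.n += 1
        for i, x in enumerate(point):
            self.ls[i] += x
            self.ss[i] += x * x

    def centroid(self):
        return [s / self.n for s in self.ls]


def online_update(micro_clusters, point, max_dist):
    """Absorb `point` into the nearest micro-cluster if it is close
    enough, otherwise start a new micro-cluster."""
    if micro_clusters:
        nearest = min(micro_clusters,
                      key=lambda mc: math.dist(mc.centroid(), point))
        if math.dist(nearest.centroid(), point) <= max_dist:
            nearest.absorb(point)
            return
    micro_clusters.append(MicroCluster(point))


mcs = []
for p in [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (0.1, 0.0)]:
    online_update(mcs, p, max_dist=1.0)
print([mc.n for mc in mcs])  # [3, 2]
```

An offline step could then run an ordinary clustering algorithm over the centroids, weighted by `n`, without ever revisiting the raw points.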

StreamKM++ [3] is a two-phase (online-offline) algorithm which maintains a small summary (a coreset) of the input data using the merge-and-reduce technique.
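The merge-and-reduce bookkeeping can be sketched as a binary counter over buckets of size m. In the sketch below the reduce step is a plain uniform random sample, swapped in as a placeholder for the coreset construction that StreamKM++ actually uses. The class and parameter names are my own.

```python
import random


class MergeAndReduce:
    """Keeps at most one summary of m points per level, like the bits of
    a binary counter. Two summaries on the same level are merged and
    reduced back to m points, then carried to the next level."""

    def __init__(self, m, seed=0):
        self.m = m
        self.buffer = []    # raw points not yet summarised
        self.buckets = []   # buckets[i] is None or a list of m points
        self.rng = random.Random(seed)

    def _reduce(self, points):
        # Placeholder: uniform sampling, NOT a real coreset construction.
        return self.rng.sample(points, self.m)

    def add(self, point):
        self.buffer.append(point)
        if len(self.buffer) < self.m:
            return
        carry, self.buffer = self.buffer, []
        level = 0
        while level < len(self.buckets) and self.buckets[level] is not None:
            carry = self._reduce(self.buckets[level] + carry)
            self.buckets[level] = None
            level += 1
        if level == len(self.buckets):
            self.buckets.append(carry)
        else:
            self.buckets[level] = carry

    def summary(self):
        points = list(self.buffer)
        for bucket in self.buckets:
            if bucket is not None:
                points.extend(bucket)
        return points


mr = MergeAndReduce(m=4)
for x in range(100):
    mr.add(x)
print(len(mr.summary()))  # 12
```

With m = 4, the 100 inputs fill 25 buffers. 25 is 11001 in binary, so three levels are occupied and only 12 points are kept, which shows how the summary grows logarithmically with the stream.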

StrAP [4] is an extension of the Affinity Propagation (AP) [44] algorithm for data streams, which uses a reservoir for saving potential outliers

DenStream [2] is a density-based data stream clustering algorithm that also uses a feature vector based on the clustering feature (CF) vector.

SOStream [50] is a density-based clustering algorithm inspired by both the principle of the DBSCAN algorithm and self-organizing maps (SOM)


Keep Thinking!!!
