"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

August 29, 2018

Day #123 - Feature Engineering Tools and Cloud ML Engine

Feature Engineering
  • Longest and most difficult phase (preprocessing)
  • Translate raw data into features based on domain knowledge
  • Create good features, including synthetic features
  • Form a reasonable hypothesis about which features matter
  • Different problems in the same domain may need different features
Feature Creation
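import numpy as np
import tensorflow as tf
#Rescale price to the range 0 to 1
#Re-scaling improves performance of gradient descent

#min_price and max_price: minimum and maximum price, computed beforehand from the training data
features['price'] = (features['price'] - min_price) / (max_price - min_price)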
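
To check the rescaling formula by hand, here is a minimal sketch in plain Python. The helper name min_max_scale and the sample prices are made up for illustration; the zero-range guard is my own assumption about how to handle a constant column.

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: avoid division by zero and map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


prices = [100.0, 250.0, 400.0]  # hypothetical sample prices
scaled = min_max_scale(prices)
print(scaled)  # [0.0, 0.5, 1.0]
```

The smallest value maps to 0, the largest to 1, and everything else falls in between in proportion.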

#category columns
#one hot encoding technique
tf.feature_column.categorical_column_with_vocabulary_list('city',keys=['San Diego','Los Angeles','San Francisco','Sacramento'])
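
What one hot encoding produces can be sketched without TensorFlow. The helper one_hot below is hypothetical, and mapping an unknown city to all zeros is an assumption of this sketch. In TensorFlow the vocabulary column is usually wrapped in tf.feature_column.indicator_column to get the one-hot vector for a deep model.

```python
def one_hot(value, vocabulary):
    """Return a one-hot list for value; all zeros if value is not in vocabulary."""
    return [1 if value == key else 0 for key in vocabulary]


cities = ['San Diego', 'Los Angeles', 'San Francisco', 'Sacramento']
print(one_hot('Los Angeles', cities))  # [0, 1, 0, 0]
print(one_hot('Fresno', cities))       # [0, 0, 0, 0]
```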

#preprocessing technique
#Clip values to a fixed range (here 0 to 4)
features['capped_rooms']=tf.clip_by_value(features['rooms'],clip_value_min=0,clip_value_max=4)

#bucketize columns
lat = tf.feature_column.numeric_column('latitude')
dlat = tf.feature_column.bucketized_column(lat,boundaries=np.arange(32,42,1).tolist())
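
The capping and bucketizing steps above can be traced with a small plain Python sketch. The helpers clip and bucket_index are stand-ins I made up, not TensorFlow functions. The bucket rule assumed here is that a value equal to a boundary goes into the upper bucket, so 10 boundaries give 11 buckets.

```python
import bisect


def clip(value, lo, hi):
    """Cap value so it stays inside [lo, hi]."""
    return max(lo, min(value, hi))


def bucket_index(value, boundaries):
    """Bucket i holds boundaries[i-1] <= value < boundaries[i]."""
    return bisect.bisect_right(boundaries, value)


boundaries = list(range(32, 42))  # 32..41, same values as np.arange(32, 42, 1)
print(clip(7, 0, 4))                   # 4
print(bucket_index(31.5, boundaries))  # 0
print(bucket_index(32.0, boundaries))  # 1
print(bucket_index(37.8, boundaries))  # 6
print(bucket_index(45.0, boundaries))  # 10
```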

Big Query
  • Fully managed data warehouse
  • Compute aggregates
  • Compute Stats
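
The aggregates and stats are computed with SQL. As a rough local stand-in (sqlite3 from the Python standard library, not BigQuery itself), the sketch below shows the kind of query involved. The table name housing and its rows are invented for illustration.

```python
import sqlite3

# In-memory database as a local stand-in for a data warehouse table.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE housing (city TEXT, price REAL)')
conn.executemany('INSERT INTO housing VALUES (?, ?)', [
    ('San Diego', 300.0),
    ('San Diego', 500.0),
    ('Los Angeles', 700.0),
])

# Overall stats, e.g. the min and max needed for rescaling a feature.
min_price, max_price, avg_price, n = conn.execute(
    'SELECT MIN(price), MAX(price), AVG(price), COUNT(*) FROM housing'
).fetchone()
print(min_price, max_price, avg_price, n)  # 300.0 700.0 500.0 3

# Per-group aggregate.
per_city = conn.execute(
    'SELECT city, AVG(price) FROM housing GROUP BY city ORDER BY city'
).fetchall()
print(per_city)  # [('Los Angeles', 700.0), ('San Diego', 400.0)]
```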
Apache Beam
  • For streaming data pipeline
  • Time windowed stats
  • Operates on Google Cloud Storage data
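
Time windowed stats means grouping events by fixed time intervals and computing a statistic per interval. Below is a minimal plain Python sketch of that idea. It is not a streaming pipeline API; the function name fixed_window_means and the sample events are assumptions for illustration.

```python
from collections import defaultdict


def fixed_window_means(events, window_seconds):
    """Group (timestamp, value) pairs into fixed windows and average each one."""
    windows = defaultdict(list)
    for ts, value in events:
        # Start of the window this timestamp falls into.
        window_start = (ts // window_seconds) * window_seconds
        windows[window_start].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(windows.items())}


# Hypothetical (timestamp in seconds, value) events.
events = [(0, 10.0), (30, 20.0), (65, 40.0), (130, 5.0)]
print(fixed_window_means(events, 60))  # {0: 15.0, 60: 40.0, 120: 5.0}
```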
Cloud Dataflow
  • Transform data from one format to another
  • BigQuery -> Cloud Storage processed data
  • Python and Java based pipelines
Cloud ML Engine
  • Use Cloud ML Engine to scale training to large datasets
  • Train the model
  • Monitor training
  • Deploy it as a microservice
  • Batching and distributed training
  • Host as a REST API
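
Once the model is hosted as a REST API, an online prediction request sends a JSON body with an "instances" list, one entry per row to score. Here is a minimal sketch that only builds that body. The feature names and values are hypothetical, and the endpoint URL and authentication are left out.

```python
import json


def build_predict_request(rows):
    """Wrap feature rows in the {"instances": [...]} body for online prediction."""
    return json.dumps({'instances': rows})


# Hypothetical feature row; the keys must match what the deployed model expects.
rows = [{'price': 0.5, 'city': 'San Diego', 'rooms': 3}]
body = build_predict_request(rows)
print(body)
```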
Tools
  • GCP Console based
  • Use all existing Google tools in GCP
  • Specify the region and the bucket to host code
  • Walkthrough of commands and project structure needed to run the project in Google Cloud ML Engine
Observations (New Learning)
  • Copying data to Google Cloud
  • Format to specify training / testing data
  • Using TensorBoard
  • Hosting as a REST API
  • Federated Learning
Happy Learning!!!
