"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

May 27, 2023

State of GPT

Andrej Karpathy's brilliant talk on LLMs

  • The emerging recipe for training GPT assistants

  • Pre-training - roughly 99% of the compute time, on an internet-scale dataset
  • Data mixture: web crawl plus high-quality sources, mixed together and sampled in proportion

  • Tokenization
  • Text to integer representation (sketch below)
  • Similar to the embeddings/word2vec discussed earlier
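A minimal sketch of text-to-integer tokenization using OpenAI's tiktoken library (the specific encoding here is an assumption; each model family ships its own tokenizer):

```python
import tiktoken

# GPT-style byte-pair encoding: text in, integer token IDs out.
enc = tiktoken.get_encoding("cl100k_base")

tokens = enc.encode("Tokenization turns text into integers.")
print(tokens)              # a list of ints, e.g. [28103, 2065, ...]
print(enc.decode(tokens))  # round-trips back to the original text
```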

  • Parameter count
  • Vocabulary size and context length
  • Objective: predict the next integer in the sequence
  • e.g. LLaMA was trained on 1.4 trillion tokens

  • Hyperparameters

  • Pre-training: token streams are packed into data batches


  • Outputs a probability distribution over what comes next

  • Lower loss means higher probability on the correct next token (toy example below)
  • Along the way the model learns powerful, general representations
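A toy illustration of the pre-training objective, assuming PyTorch: the loss is just cross-entropy on the next token, so lower loss literally means more probability mass on the token that actually came next:

```python
import torch
import torch.nn.functional as F

vocab_size = 8
logits = torch.randn(1, vocab_size)  # model's scores for the next token
target = torch.tensor([3])           # the token that actually came next

loss = F.cross_entropy(logits, target)       # equals -log p(target)
p_correct = F.softmax(logits, dim=-1)[0, 3]
print(loss.item(), p_correct.item())         # lower loss <=> higher p(correct)
```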


  • LLM + few-shot learning is the standard practice
  • Next-token prediction forces the transformer to multitask
  • It is forced to understand the text and what produced it
  • Few-shot prompting a base model is often easier than finetuning it (example below)
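A made-up illustration of few-shot prompting: the task is specified entirely in-context, and the base model simply continues the pattern:

```python
# The base model sees the pattern and completes it; no finetuning needed.
prompt = """\
English: cheese    French: fromage
English: bread     French: pain
English: water     French:"""
# The most likely completion of this document is " eau".
```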


  • Base models are not assistants
  • They just complete documents

  • Not very reliable on their own
  • Supervised finetuning (SFT)
  • Small, high-quality datasets
  • Written by human contractors
  • Prompt-response pairs are collected
  • Nothing changes algorithmically; only the training set is swapped out
  • Q&A data: low quantity, high quality
  • Contractors follow labeling guidelines to write the responses (format sketch below)
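A hedged sketch of what one SFT training example looks like; the field names and tags are illustrative, not a specific dataset's schema:

```python
# Prompt-response pair written by a contractor, formatted into a single
# document the model is trained to complete.
sft_example = {
    "prompt": "Explain what tokenization is in one sentence.",
    "response": "Tokenization converts raw text into a sequence of "
                "integer IDs the model can process.",
}

# Language modeling continues as before on the concatenated text; the loss
# is typically masked so only the response tokens are trained on.
training_text = f"<prompt>{sft_example['prompt']}</prompt>\n{sft_example['response']}"
```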

  • A reward is read out at a token appended after each completion
  • The reward model scores the quality of each completion
  • The loss is reformulated with human rankings as the ground truth (sketch below)
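A sketch of the reward-model objective under the usual pairwise-ranking formulation (the scalar rewards below are placeholders, not real model outputs):

```python
import torch
import torch.nn.functional as F

# r(prompt, completion) for the human-preferred and the rejected completion.
reward_chosen = torch.tensor(1.2, requires_grad=True)
reward_rejected = torch.tensor(0.4, requires_grad=True)

# -log sigmoid(r_chosen - r_rejected): small when the model agrees with
# the human ranking, large when it disagrees.
loss = -F.logsigmoid(reward_chosen - reward_rejected)
loss.backward()
print(loss.item())
```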

  • Reinforcement learning with respect to the reward model
  • Tokens of high-reward completions get their probabilities reinforced (sketch below)
  • The full pipeline:
  • Base model
  • SFT model - supervised fine-tuning
  • RM model - reward model training
  • RL model - reinforcement learning
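A hedged sketch of the RL step as a plain REINFORCE-style update; production systems use PPO with clipping and a KL penalty, and every tensor here is a placeholder:

```python
import torch

# Log-probs of the tokens in one sampled completion (placeholder values).
logprobs = torch.randn(4, requires_grad=True)
reward = torch.tensor(0.8)  # scalar score from the reward model

# Weight the completion's log-likelihood by its reward: high-reward
# completions get their token probabilities pushed up.
loss = -(reward * logprobs).sum()
loss.backward()
```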

  • Model ranking (Elo-style leaderboards)

  • Applications


  • Compare the template of a human writing an article with how an LLM generates one

  • The model spends the same compute on every token
  • LLMs are token simulators
  • They imitate the next token; they do not know what they do not know
  • Factual knowledge is stored in the parameters
  • The context window is a large working memory
  • The transformer has direct access to everything in that memory
  • Chain of thought: spread reasoning across more tokens
  • Prompt it so it will revisit and check its work (example below)
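An illustrative chain-of-thought prompt (the wording is one common variant, not the only one):

```python
# Ask the model to spread its reasoning over more tokens instead of
# committing to an answer in a single step.
prompt = (
    "Q: A train travels 60 km in 45 minutes. What is its speed in km/h?\n"
    "A: Let's think step by step."
)
# The completion now reasons token by token (45 min = 0.75 h;
# 60 / 0.75 = 80 km/h) rather than guessing immediately.
```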


  • Slow vs fast reasoning (System 2 vs System 1)
  • Step-by-step reasoning vs a one-step answer
  • Tree-search algorithms over reasoning paths (sketch below)
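A minimal sketch of tree search over reasoning paths, Tree-of-Thoughts style; `generate` and `score` are hypothetical stand-ins for an LLM sampling call and a value/critic call:

```python
def tree_search(prompt: str, generate, score, branch: int = 3, depth: int = 2) -> str:
    frontier = [prompt]
    for _ in range(depth):
        # Expand each partial reasoning path into several continuations...
        candidates = [path + step
                      for path in frontier
                      for step in generate(path, n=branch)]
        # ...and keep only the most promising paths at each level.
        frontier = sorted(candidates, key=score, reverse=True)[:branch]
    return frontier[0]
```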

  • Chains / agents
  • General techniques for decomposing tasks
  • ReAct-style sequences of thought / action / observation (trace below)
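An illustrative ReAct-style trace; the tool name and the trace contents are made up for illustration:

```python
# The model interleaves free-form thoughts with tool actions and the
# observations those actions return.
trace = """\
Question: What is the population of the capital of France?
Thought: I need to find the capital of France first.
Action: search("capital of France")
Observation: Paris
Thought: Now I need Paris's population.
Action: search("population of Paris")
Observation: about 2.1 million
Answer: About 2.1 million people.
"""
```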


  • Explicitly ask for good performance
  • e.g. "You are an expert on this topic"
  • Conditions the model into the high-quality part of its data distribution (and away from, say, sci-fi dialogue); example below
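One illustrative phrasing of such a prompt (the exact wording is an assumption; many variants work):

```python
# Condition the model on a persona that correlates with high-quality
# answers in its training distribution.
system_prompt = (
    "You are an expert on this topic. "
    "Answer carefully and make sure the reasoning is correct."
)
```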



  • Tell it in the prompt what it is not good at
  • e.g. "use a calculator"; teach the LLM to use tools
  • A spectrum from retrieval-only to memory-only
  • Retrieval-augmented models load relevant context into the working memory (sketch below)
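A minimal retrieval-augmented-generation sketch: embed the documents, pick the most similar ones to the query, and stuff them into the prompt. `embed` is a hypothetical stand-in for an embedding-model call returning a 1-D numpy vector:

```python
import numpy as np

def retrieve(query: str, docs: list[str], embed, k: int = 2) -> list[str]:
    # Rank documents by cosine similarity to the query.
    q = embed(query)
    doc_vecs = [embed(d) for d in docs]
    sims = [float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))
            for v in doc_vecs]
    top = sorted(range(len(docs)), key=lambda i: sims[i], reverse=True)[:k]
    return [docs[i] for i in top]

def build_prompt(query: str, context: list[str]) -> str:
    # Load the retrieved passages into the model's working memory.
    return "Context:\n" + "\n".join(context) + f"\n\nQuestion: {query}\nAnswer:"
```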



  • Constrained prompting
  • Forcing output templates
  • e.g. output as JSON


  • The base model's sampling is clamped to the template (sketch below)
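A hedged sketch of the template-clamping idea: fix the JSON skeleton yourself and only let the model fill the blanks, then validate. The schema is illustrative:

```python
import json

# Only the blanks are generated; the structure is guaranteed by the template.
template = '{{"name": "{name}", "age": {age}}}'

def fill(name_completion: str, age_completion: str) -> dict:
    raw = template.format(name=name_completion, age=age_completion)
    return json.loads(raw)  # parseable by construction if the fills are sane

# Logit-level versions of this idea clamp sampling so that only tokens
# consistent with the template are ever allowed.
print(fill("Ada", "36"))
```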

  • Finetuning: actually changing the model's weights
  • Still typically needs human contractors for the data





  • Recommendations


  • Use cases


Keep Exploring!!!
