Brilliant talk on LLMs
- Emerging recipe for training LLMs
- Pre-training - ~99% of the total compute time, on an internet-scale dataset
- Data mixture: web crawl plus high-quality sources, mixed together and sampled in set proportions
- Tokenization
- Text to integer representation (sketch below)
- Similar in spirit to the embeddings/word2vec we discussed earlier
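A toy character-level tokenizer shows the idea (production models use subword schemes like BPE; this example is purely illustrative):

```python
# Minimal character-level tokenizer sketch: text -> list of ints and back.
# Real LLMs use subword tokenizers, but the core idea is the same:
# every string becomes a sequence of integers.

text = "hello world"
vocab = sorted(set(text))                      # toy vocabulary from the text itself
stoi = {ch: i for i, ch in enumerate(vocab)}   # string -> int
itos = {i: ch for ch, i in stoi.items()}       # int -> string

def encode(s: str) -> list[int]:
    return [stoi[c] for c in s]

def decode(ids: list[int]) -> str:
    return "".join(itos[i] for i in ids)

ids = encode(text)
print(ids)                 # [3, 2, 4, 4, 5, 0, 7, 5, 6, 4, 1]
assert decode(ids) == text
```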
- Parameters
- Vocabulary (token) size
- Predict the next integer in the sequence
- ~1.4 trillion training tokens
- Hyperparameters
- Pre-training: tokens are packed into data batches
- The model outputs a probability distribution over what comes next
- Lower loss means higher probability on the correct next token (sketch below)
- Along the way it learns powerful, general representations
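A minimal sketch of the objective, with made-up logits standing in for a real model's output:

```python
import numpy as np

# Next-token objective sketch: the model emits logits over the vocabulary;
# the loss is the negative log-probability of the token that actually came
# next. Lower loss = more probability mass on the correct token.

vocab_size = 5
logits = np.array([2.0, 0.5, -1.0, 0.1, 1.2])   # hypothetical model output
target = 0                                       # the true next token id

probs = np.exp(logits - logits.max())
probs /= probs.sum()                             # softmax -> distribution over vocab
loss = -np.log(probs[target])                    # cross-entropy at this position

print(probs)   # probability of each candidate next token
print(loss)    # small when probs[target] is close to 1
```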
- LLM + few-shot prompting is the standard practice
- The transformer is forced to multitask inside next-token prediction
- To predict well, it is forced to actually understand the text
- So you can fine-tune them, or simply prompt them - few-shot prompting often works well (example below)
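For instance, a few-shot prompt for a made-up sentiment task - the base model simply continues the pattern, with no weight updates involved:

```python
# Few-shot prompting sketch: instead of fine-tuning, show the base model a
# handful of input->output examples and let it continue the pattern.
# The task and reviews are invented for illustration.

few_shot_prompt = """\
Review: The movie was a delight from start to finish.
Sentiment: positive

Review: I walked out halfway through.
Sentiment: negative

Review: The plot dragged but the acting was superb.
Sentiment:"""

# Sent to a completion model, this tends to yield something like " positive"
# or " mixed" -- the model multitasks by imitating the document's pattern.
print(few_shot_prompt)
```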
- Base models are not assistants - they are not systems
- A base model just completes documents: it fills in whatever text plausibly comes next
- So out of the box they are not very reliable
- Supervised fine-tuning (SFT)
- Small, high-quality datasets
- Written by human contractors
- Prompt-response pairs are collected
- Training is unchanged - you are just swapping out the training set
- Q&A style: low quantity, high quality
- Contractors follow labeling instructions for the structure and write the responses (data sketch below)
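A sketch of what the swapped-in dataset might look like; the chat template and examples are hypothetical, and real pipelines typically mask the loss so only response tokens count:

```python
# SFT data-preparation sketch: small, high-quality prompt->response pairs
# written by human contractors, formatted into training documents.
# The <|user|>/<|assistant|> template below is made up; real systems
# define their own special tokens.

sft_examples = [
    {"prompt": "Explain what a tokenizer does.",
     "response": "A tokenizer converts text into a sequence of integer ids..."},
    {"prompt": "Write a haiku about autumn.",
     "response": "Crisp leaves underfoot..."},
]

def to_training_text(ex: dict) -> str:
    # Same next-token objective as pre-training; only the dataset is swapped.
    return f"<|user|>\n{ex['prompt']}\n<|assistant|>\n{ex['response']}"

for ex in sft_examples:
    print(to_training_text(ex))
    print("---")
```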
- Reward model: the reward is read out at a special token
- It scores the quality of each completion
- The loss is reformulated so human rankings act as the ground truth (sketch below)
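A minimal sketch of the implied pairwise ranking loss (a Bradley-Terry-style objective; the reward values are made up):

```python
import numpy as np

# Reward-model training sketch: for one prompt, humans rank two completions.
# The RM reads (prompt + completion) and emits a scalar reward at a read-out
# token; the loss pushes the preferred completion's reward above the other.

r_chosen = 1.3    # reward assigned to the human-preferred completion
r_rejected = 0.2  # reward for the completion ranked worse

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

loss = -np.log(sigmoid(r_chosen - r_rejected))
print(loss)  # shrinks as the reward gap grows in the right direction
```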
- Reinforcement learning with respect to the reward model
- Reinforce: raise the probabilities of tokens in completions the RM scored highly (policy-gradient sketch after the pipeline below)
- The four stages: base model → SFT (supervised fine-tuning) → RM (reward model) training → RL (reinforcement learning)
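A rough sketch of the RL step as a plain policy-gradient update - production pipelines typically use PPO with a KL penalty toward the SFT model, and every number here is invented:

```python
import numpy as np

# RLHF sketch: sample completions from the SFT model, score them with the
# frozen reward model, then reinforce -- scale each completion's token
# log-probabilities by its advantage so high-reward completions become
# more likely under the policy.

rewards = np.array([0.9, -0.3, 0.1])        # RM scores for 3 sampled completions
logprobs = np.array([-12.4, -9.8, -11.1])   # sum of token log-probs per completion

advantages = rewards - rewards.mean()        # simple baseline subtraction
loss = -(advantages * logprobs).mean()       # policy-gradient objective to minimize

print(loss)  # its gradient raises the probability of rewarded tokens
```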
- Model Ranking
- Applications
- Example: writing an article - a human plans, consults references, drafts, and revises
- An LLM spends roughly the same compute on every token
- LLMs are token simulators
- They just imitate the next token
- Fact-based knowledge is stored in the parameters
- The context window is a large working memory
- Self-attention gives the transformer direct access to everything in that memory
- Chain of thought
- If you prompt for intermediate steps, the model can revisit and check its work
- Slow vs fast reasoning (System 2 vs System 1)
- Step-by-step reasoning vs a one-step answer (example below)
- Tree-search approaches: sample multiple reasoning paths and search over them (Tree of Thought)
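A made-up example contrasting the two prompting styles:

```python
# Chain-of-thought sketch: asking for intermediate steps spreads the
# computation over many tokens instead of demanding a one-shot answer.
# The question is invented for illustration.

question = "A shop sells pens in packs of 12. How many pens are in 7 packs?"

one_step = f"Q: {question}\nA: The answer is"                  # fast, System-1 style
step_by_step = f"Q: {question}\nA: Let's think step by step."  # slow, System-2 style

print(one_step)
print(step_by_step)
# The second prompt elicits working-out like "7 x 12 = 84", giving the
# model tokens in which to revisit and check its reasoning.
```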
- Chains / agents
- General techniques
- The thought → action → observation sequence (ReAct-style; skeleton below)
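A ReAct-style transcript skeleton (the question, tool name, and answer are all invented for illustration):

```python
# ReAct-style prompt skeleton: the model alternates Thought / Action /
# Observation lines; the surrounding harness executes each Action and
# feeds the result back in as the Observation.

react_prompt = """\
Question: What year was the transformer paper published?
Thought: I should look this up rather than guess.
Action: search("Attention Is All You Need publication year")
Observation: The paper was published in 2017.
Thought: I now know the answer.
Answer: 2017"""

# An agent loop would stop generation at each "Action:", run the tool,
# append the real "Observation:", and continue until "Answer:" appears.
print(react_prompt)
```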
- Ask for good performance - the model won't give it by default
- "You are an expert on this topic" style conditioning helps (messages example below)
- But don't over-ask: extreme personas land in the data distribution of sci-fi
- Also tell it in the prompt what it's not good at
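For example, persona conditioning in the common chat-messages format (role/content dicts as used by chat APIs such as OpenAI's; the message contents are illustrative):

```python
# "Ask for good performance" sketch: claiming expertise conditions the
# model on the higher-quality parts of its data distribution.

messages = [
    {"role": "system",
     "content": "You are an expert on this topic. Give a careful, correct answer."},
    {"role": "user",
     "content": "Why does next-token prediction produce useful representations?"},
]
print(messages)
```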
- Use a calculator: teach the LLM to use tools instead of doing mental arithmetic (toy harness below)
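A toy tool-use harness, assuming a hypothetical CALC(...) marker the model has been taught (via prompting or fine-tuning) to emit:

```python
import ast
import operator
import re

# Toy calculator tool: the harness finds CALC(expr) markers in the model's
# output, evaluates the arithmetic safely, and splices the result back.
# The marker syntax and the model output below are hypothetical.

OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def safe_eval(expr: str) -> float:
    # Evaluate arithmetic only -- never call raw eval() on model output.
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval"))

model_output = "Seven packs of 12 pens is CALC(7 * 12) pens."
resolved = re.sub(r"CALC\(([^)]+)\)",
                  lambda m: str(safe_eval(m.group(1))),
                  model_output)
print(resolved)  # -> "Seven packs of 12 pens is 84 pens."
```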
- A spectrum from retrieval-only to memory-only approaches
- Retrieval-augmented models (toy sketch below)
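A toy retrieval-augmented sketch - real systems use dense embeddings and a vector store; the documents and scoring function here are illustrative:

```python
# RAG sketch: score a toy "index" of documents against the query with
# bag-of-words overlap, then stuff the best hit into the prompt so the
# model works from retrieved text rather than parameter memory alone.

docs = [
    "The transformer architecture was introduced in 2017.",
    "Tokenizers convert text into sequences of integer ids.",
    "Reward models score completions with a single scalar.",
]

def overlap(a: str, b: str) -> int:
    return len(set(a.lower().split()) & set(b.lower().split()))

query = "When was the transformer introduced?"
best_doc = max(docs, key=lambda d: overlap(query, d))

prompt = f"Context: {best_doc}\n\nQuestion: {query}\nAnswer:"
print(prompt)
```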
- Constrained prompting
- Forcing output templates
- e.g. output as JSON
- The base model's logits can be clamped so sampling must conform to the template (sketch below)
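A sketch of the softer validate-and-retry approach, with `call_llm` as a hypothetical stand-in for whatever completion API is in use (logit-clamping libraries enforce the template during sampling instead):

```python
import json

# Constrained-output sketch: ask for JSON, then validate before use and
# retry on failure. The prompt, schema, and fake model are illustrative.

SCHEMA_PROMPT = """\
Extract the person's name and age. Reply with ONLY this JSON template:
{"name": "<string>", "age": <integer>}

Text: Ada Lovelace was 36 when she died."""

def parse_or_retry(call_llm, prompt: str, retries: int = 3) -> dict:
    for _ in range(retries):
        raw = call_llm(prompt)
        try:
            out = json.loads(raw)
            if isinstance(out.get("name"), str) and isinstance(out.get("age"), int):
                return out
        except json.JSONDecodeError:
            pass  # malformed -- ask again
    raise ValueError("model never produced valid JSON")

# Example with a fake model standing in for the real API:
fake_llm = lambda _prompt: '{"name": "Ada Lovelace", "age": 36}'
print(parse_or_retry(fake_llm, SCHEMA_PROMPT))
```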
- Fine-tuning
- Changes the model's weights; the high-quality data still comes from human contractors
- Recommendations
- Use cases
Keep Exploring!!!