Unfortunately, too few people understand the distinction between memorization and understanding. It's not some lofty question like "does the system have an internal world model?", it's a very pragmatic behavior distinction: "is the system capable of broad generalization, or is… https://t.co/1fagV1YI15
— François Chollet (@fchollet) December 15, 2023
Future LLMs will be a different breed than current LLMs, so what's true for current LLMs might not apply in 1 year:
- Self-play: Allows these models to learn beyond human data, like AlphaGo
- Bootstrap: if validation is easier than generation, the models can bootstrap by…
— Will Bryk (@WilliamBryk) December 24, 2023
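As a rough illustration of the "validation is easier than generation" point in the tweet above (which is truncated), here is a minimal, hypothetical sketch of a generate-then-verify bootstrap loop. `ToyModel`, `bootstrap`, and the verifier are placeholder names invented for this sketch; they do not correspond to any real system or library API.

```python
# Hypothetical sketch of a "bootstrap via verification" loop: if checking a
# candidate answer is cheaper/more reliable than producing one, a model can
# generate many candidates, keep only the verified ones, and train on those.
# All interfaces here are placeholders, not any specific library's API.

import random
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ToyModel:
    """Stands in for an LLM; stores (prompt, answer) pairs it has 'learned'."""
    memory: List[Tuple[str, str]] = field(default_factory=list)

    def generate(self, prompt: str, n_samples: int = 8) -> List[str]:
        # Produce several noisy candidate answers for the prompt.
        return [f"candidate-{random.randint(0, 99)}" for _ in range(n_samples)]

    def finetune(self, examples: List[Tuple[str, str]]) -> None:
        # Pretend to update weights by memorizing verified examples.
        self.memory.extend(examples)


def bootstrap(model: ToyModel,
              prompts: List[str],
              verify: Callable[[str, str], bool],
              rounds: int = 3) -> None:
    """Self-improvement loop: generate -> verify -> train on what passed."""
    for _ in range(rounds):
        verified = []
        for prompt in prompts:
            for answer in model.generate(prompt):
                if verify(prompt, answer):   # cheap check
                    verified.append((prompt, answer))
        model.finetune(verified)             # learn only from verified outputs


if __name__ == "__main__":
    # Toy verifier: accepts answers whose trailing number is even.
    is_valid = lambda prompt, ans: int(ans.split("-")[1]) % 2 == 0
    m = ToyModel()
    bootstrap(m, ["What is 2+2?", "Name a prime."], is_valid)
    print(f"verified examples collected: {len(m.memory)}")
```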
Limits of Transformers on Compositionality
- First, transformers solve compositional tasks by reducing multi-step compositional reasoning into linearized path matching.
- This contrasts with the systematic multi-step reasoning approach that learns to apply underlying computational rules required for building correct answers [71, 37, 27].
- Shortcut learning [29] via pattern-matching may yield fast correct answers when similar compositional patterns are available during training but does not allow for robust generalization to uncommon or complex examples.
- Second, due to error propagation, transformers may have inherent limitations on solving high-complexity compositional tasks that exhibit novel patterns. Errors in the early stages of the computational process can lead to substantial compounding errors in subsequent steps, preventing models from finding correct solutions.
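The error-propagation point lends itself to a back-of-the-envelope calculation: if each step of a compositional solution is correct with independent probability p, a k-step chain is correct with probability roughly p^k. The independence assumption and the numbers below are simplifications for illustration, not figures from the paper.

```python
# Illustrative arithmetic for the error-propagation point above: if each step
# of a compositional solution is correct with independent probability p_step,
# the whole n_steps-long chain is correct with probability roughly p_step**n.
# The values of p_step and n_steps are made up for illustration only.

def chain_success_probability(p_step: float, n_steps: int) -> float:
    """Probability that every one of n_steps independent steps is correct."""
    return p_step ** n_steps

for p_step in (0.99, 0.95, 0.90):
    for n_steps in (5, 10, 20, 40):
        p_chain = chain_success_probability(p_step, n_steps)
        print(f"per-step accuracy {p_step:.2f}, {n_steps:2d} steps "
              f"-> full-solution accuracy {p_chain:.3f}")
```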
Embers of Autoregression
- Humans and large language models (LLMs) have some shared properties and some properties that differ. If LLMs are analyzed using tests designed for humans, we risk identifying only the shared properties, missing the properties that are unique to LLMs (the dotted region of the diagram). We argue that to identify the properties in the dotted region we must approach LLMs on their own terms by considering the problem that they were trained to solve: next-word prediction over Internet text.
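One way to take the "next-word prediction over Internet text" framing literally is to score how probable a causal language model finds a given output string; outputs that are common as web text can score well regardless of any reasoning, which can confound task evaluations. The sketch below is my own illustration (not code from the paper), using the Hugging Face `transformers` library with `gpt2` as an arbitrary small model.

```python
# A minimal sketch of scoring strings by the probability a causal LM assigns
# to them. Phrasings that are common as Internet text tend to get higher
# average log-probability. Requires the `transformers` and `torch` packages;
# "gpt2" and the example sentences are arbitrary illustrative choices.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()


def avg_logprob(text: str) -> float:
    """Average per-token log-probability the model assigns to `text`."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # With labels=input_ids, the model returns mean cross-entropy loss.
        loss = model(ids, labels=ids).loss
    return -loss.item()


# Two candidate outputs with similar meaning; the phrasing that is more
# typical of web text usually scores higher.
for s in ["The capital of France is Paris.",
          "Paris is the city that serves as France's capital."]:
    print(f"{avg_logprob(s):7.3f}  {s}")
```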
On the Measure of Intelligence
- Describes intelligence as skill-acquisition efficiency, highlighting scope, generalization difficulty, priors, and experience as the critical pieces to account for when characterizing intelligent systems (a toy numerical sketch of this idea follows the list below).
- Intelligence as a collection of task-specific skills
- Intelligence as a general learning ability
- Skill-based, narrow AI evaluation
- The spectrum of generalization: robustness, flexibility, generality
- System-centric generalization: this is the ability of a learning system to handle situations it has not itself encountered before.
- Developer-aware generalization: this is the ability of a system, either learning or static, to handle situations that neither the system nor the developer of the system have encountered before.
- Local generalization, or “robustness”: this is the ability of a system to handle new points from a known distribution for a single task or a well-scoped set of known tasks.
- Broad generalization, or “flexibility”: this is the ability of a system to handle a broad category of tasks and environments without further human intervention.
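Chollet's formal definition is grounded in Algorithmic Information Theory and averages over a scope of tasks and curricula; the toy sketch below only captures the qualitative shape of "skill-acquisition efficiency": credit the generalization difficulty achieved and discount the priors and experience spent achieving it. The ratio and numbers are my own simplifying assumptions, not the paper's formula.

```python
# A deliberately simplified toy version of "intelligence as skill-acquisition
# efficiency". Chollet's formal definition is far richer; this sketch only
# illustrates the qualitative idea that the same attained skill counts for
# less when it required more built-in priors or more experience. All numbers
# and the exact ratio are illustrative assumptions, not from the paper.

from dataclasses import dataclass


@dataclass
class TaskOutcome:
    skill: float                      # skill attained on the task, in [0, 1]
    generalization_difficulty: float  # how far the test setting is from experience
    priors: float                     # amount of built-in knowledge used
    experience: float                 # amount of training data/interaction used


def toy_skill_acquisition_efficiency(outcomes: list) -> float:
    """Average, over a scope of tasks, of skill-weighted generalization
    difficulty per unit of (priors + experience). Purely illustrative."""
    scores = [
        o.skill * o.generalization_difficulty / (o.priors + o.experience + 1e-9)
        for o in outcomes
    ]
    return sum(scores) / len(scores)


# Same skill on the same task, but system B needs far more experience,
# so it scores lower on this toy efficiency measure.
system_a = [TaskOutcome(skill=0.9, generalization_difficulty=0.8, priors=0.2, experience=1.0)]
system_b = [TaskOutcome(skill=0.9, generalization_difficulty=0.8, priors=0.2, experience=10.0)]
print("A:", toy_skill_acquisition_efficiency(system_a))
print("B:", toy_skill_acquisition_efficiency(system_b))
```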