Один момент, який я висловив і який не прозвучав:
- Масштабування поточного пристрою й надалі призведе до покращень. Зокрема, він не затягується.
- Але щось важливе й надалі буде відсутнє.
here are the most important points from today's ilya sutskever podcast:
- superintelligence in 5-20 years
- current scaling will stall hard; we're back to real research
- superintelligence = super-fast continual learner, not finished oracle
- models generalize 100x worse than humans, the biggest AGI blocker
- need completely new ML paradigm (i have ideas, can't share rn)
- AI impact will hit hard, but only after economic diffusion
- breakthroughs historically needed almost no compute
- SSI has enough focused research compute to win
- current RL already eats more compute than pre-training
New Anthropic research: Natural emergent misalignment from reward hacking in production RL.
“Reward hacking” is where models learn to cheat on tasks they’re given during training.
Our new study finds that the consequences of reward hacking, if unmitigated, can be very serious.