Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters
Dylan Patel
And post-training is where there is a lot more complexity in terms of how the process is emerging or evolving, and in the different types of training losses that you will use. A lot of these techniques are grounded in the natural language processing literature. The oldest technique, which is still used today, is something called instruction tuning, also known as supervised fine-tuning.
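To make the idea concrete, here is a minimal sketch of the supervised fine-tuning loss, not the exact recipe discussed in the episode: the model is trained with ordinary next-token cross-entropy on (prompt, response) pairs, with the loss masked so gradients only come from the response tokens. The tiny model, toy vocabulary, and example token ids below are hypothetical stand-ins; real instruction tuning starts from a pretrained LLM and a curated instruction dataset.

```python
# Minimal supervised fine-tuning (instruction tuning) sketch in PyTorch.
# Everything here (TinyLM, the token ids, VOCAB_SIZE) is a toy stand-in.
import torch
import torch.nn as nn

VOCAB_SIZE = 1000  # hypothetical toy vocabulary


class TinyLM(nn.Module):
    """Stand-in causal language model: embedding + GRU + output head."""

    def __init__(self, vocab_size=VOCAB_SIZE, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, tokens):
        h, _ = self.rnn(self.embed(tokens))
        return self.head(h)  # logits predicting the next token at each position


model = TinyLM()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss(ignore_index=-100)

# One (prompt, response) pair as toy token ids; in practice these come from
# an instruction-tuning dataset.
prompt = torch.tensor([5, 17, 42, 9])
response = torch.tensor([88, 23, 7, 2])
tokens = torch.cat([prompt, response]).unsqueeze(0)  # shape (1, T)

# Standard next-token targets, but the prompt portion is masked out so the
# loss (and gradient) only covers the model's response tokens.
targets = tokens.clone().roll(-1, dims=1)
targets[:, -1] = -100                    # no target for the final position
targets[:, : len(prompt) - 1] = -100     # ignore loss on prompt positions

logits = model(tokens)                   # shape (1, T, VOCAB_SIZE)
loss = loss_fn(logits.view(-1, VOCAB_SIZE), targets.view(-1))
loss.backward()
optimizer.step()
print(f"SFT cross-entropy loss: {loss.item():.3f}")
```

Masking the prompt tokens is a common design choice so the model learns to produce good responses rather than to reproduce the prompts themselves; some SFT setups train on the full sequence instead.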