Tool Use Co-Host
Appearances
The AI Daily Brief (Formerly The AI Breakdown): Artificial Intelligence News and Analysis
NLW on the Future of AI Agents
Absolutely. I've kind of tried to teach people to view it like Wikipedia: use it to get started, but it's not something you can cite as a reference in your paper. As for the hallucinations, a lot of people try to solve this with evals, building a robust enough eval set that they're able to mitigate some of the risks of hallucinations.
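To make the eval idea concrete, here is a minimal sketch, assuming an eval set is just a list of prompts paired with text the answer must contain. The names `EVAL_SET`, `run_evals`, and `stub_model` are hypothetical, and `stub_model` stands in for a real LLM call; production eval suites are usually far larger and use richer scoring than a substring check.

```python
from typing import Callable, List, Tuple

# Hypothetical eval cases: (prompt, text the answer is expected to contain).
EVAL_SET: List[Tuple[str, str]] = [
    ("What is the capital of France?", "Paris"),
    ("How many days are in a leap year?", "366"),
]


def run_evals(model: Callable[[str], str], cases: List[Tuple[str, str]]) -> float:
    """Return the fraction of cases whose answer contains the expected text."""
    if not cases:
        raise ValueError("eval set is empty")
    passed = sum(
        1 for prompt, expected in cases
        if expected.lower() in model(prompt).lower()
    )
    return passed / len(cases)


def stub_model(prompt: str) -> str:
    """Stand-in for a real LLM call; one canned answer is wrong on purpose."""
    canned = {
        "What is the capital of France?": "The capital of France is Paris.",
        "How many days are in a leap year?": "A leap year has 365 days.",
    }
    return canned.get(prompt, "")


if __name__ == "__main__":
    score = run_evals(stub_model, EVAL_SET)
    print(f"pass rate: {score:.0%}")
```

Tracking a pass rate like this across model or prompt changes is one way to notice when hallucinated answers creep in, though it only covers the cases someone thought to write down.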
Do you find businesses are implementing any other types of strategies, or are they even following through with the evals, or just kind of YOLOing it? What's the vibe in the business community?
I wouldn't mind diving into the security aspect a little bit. We're familiar with this one project, CodeGate, which kind of acts as a local proxy that your LLM requests route through so it can redact PII and stuff like that. But it seems to be just getting started. Do you have any tools or advice for companies that are concerned about security and bringing LLMs into their workflow?
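As a rough illustration of the redact-before-forwarding idea, here is a minimal sketch, assuming simple regex matching for email addresses and US-style Social Security numbers. It is not how the project mentioned above is implemented; `PATTERNS`, `redact`, and `send_via_proxy` are hypothetical names, and real PII detection needs much broader coverage than two patterns.

```python
import re
from typing import Callable

# Hypothetical patterns; a real redactor would cover many more PII types.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text: str) -> str:
    """Replace each matched PII span with a labeled placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED_{label}]", text)
    return text


def send_via_proxy(prompt: str, forward: Callable[[str], str]) -> str:
    """Redact the prompt locally, then hand it to the upstream LLM call.

    `forward` is an assumption: any function that sends text to a model
    and returns its reply.
    """
    return forward(redact(prompt))


if __name__ == "__main__":
    # Echo function used in place of a real model call.
    print(send_via_proxy("Email jane.doe@example.com, SSN 123-45-6789", lambda p: p))
```

The point of the design is that redaction happens on the local machine, so the raw values never leave it regardless of which model sits behind `forward`.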
This week, we're joined by Nathaniel Whittemore, also known as NLW, the founder and CEO of Superintelligent, as well as the host of my favorite daily AI podcast, The AI Daily Brief. NLW, welcome to Tool Use.
Yeah, I've found something similar, where people say, oh, the newest agent from OpenAI, Deep Research, which I've used and is great. And other people say, well, what about Code Interpreter? Is that an agent? And ultimately it doesn't matter whether it's a tool or a workflow, as long as it solves a certain task for you. Through your use of them, what types of use cases are you excited for?
What have you found to be actually helpful in the current state?
Yeah, absolutely. And I've even seen the progression from having those chats with Claude or ChatGPT to get some input, help with the brainstorming, coming up with titles, to creating a Claude Project where you can upload a bunch of documents, a bunch of standards and best practices, so you can get more consistent results over time.
We've also experimented with the AI editors and we've yet to find success there. But it's interesting how the chasm between what works today and what is, you know, not quite working, what's a little ways off is just shrinking by the day.
Have you noticed any tools in your workflow that have really allowed you to completely offload a process, or are you still a human in the loop a lot of the time for these types of things?
Yeah, absolutely. And as a longtime listener, I can tell you that the added personality, the added perspective always helps, beyond just, you know, an information dump. I actually wouldn't mind double-clicking on Deep Research, because I've also used it and had positive results. But as you mentioned, on the Twitter vibe test, a lot of people didn't seem to like it.
A lot of people did, but it was one of those right-down-the-middle ones. What's your experience been like with it? Do you think it's a step in the right direction? And even just long-running AI processes in general, do you think that's the future?