Oxide and Friends

Unshrouding Turin (or Benvenuto a Torino)

Wed, 16 Oct 2024

Description

George Cozma of Chips and Cheese joined Bryan, Adam, and the Oxide Friends to talk about AMD's new 5th generation EPYC processor, codename: Turin. What's new in Turin and how is Oxide's Turin-based platform coming along?

In addition to Bryan Cantrill and Adam Leventhal, we were joined by special guest George Cozma, as well as Oxide colleagues Robert Mustacchi, Eric Aasen, Nathanael Huffman, and the quietly observant Aaron Hartwig.

Some of the topics we hit on, in the order that we hit them:

Chips and Cheese: AMD's Turin: 5th Gen EPYC Launched
End of the Road: An AnandTech Farewell
Centaur Technology
AVX-512
Zen5's AVX512 Teardown + More...
Thermal Design Power (TDP)
OxF: Rack Scale Networking (use of p4)
P4
AGESA
OxF: The Network Behind the Network (Oxide server recovery)
openSIL
phoronix: openSIL
PCB backdrilling
OxF: AMD's MI300 (APUs)
dtrace.conf(24) -- The DTrace unconference, December 11th, 2024

If we got something wrong or missed something, please file a PR!

Our next show will likely be on Monday at 5p Pacific Time on our Discord server; stay tuned to our Mastodon feeds for details, or subscribe to this calendar. We'd love to have you join us, as we always love to hear from new speakers!

Transcription

0.049 - 6.451 Bryan Cantrill

I noticed that you changed up the title. You did not like my... Okay, yeah, go ahead. Go ahead, explain yourself.

6.471 - 10.152 Aaron Hartwig

Here's where I wanted to start. I love your title. Okay.

11.453 - 17.795 Bryan Cantrill

I mean, why would someone... Obviously, I love my title. I love your title. Okay, there's a but coming.

17.975 - 19.335 Aaron Hartwig

There's not even a but coming.

19.715 - 23.897 Bryan Cantrill

Listen, pal, we've been rejected by enough VCs. I know a breakup email when I see it.

25.066 - 28.367 Aaron Hartwig

No, like unequivocally, I love it. Period. The end.

28.647 - 32.868 Bryan Cantrill

We love meeting Oxide. We're very excited about Oxide. Okay. Yeah. Go on.

33.428 - 38.309 Aaron Hartwig

Look, I'm not cheering from the sidelines. I want to be in the game with you. I'm in the game.

39.049 - 43.71 Bryan Cantrill

Okay. Go on. So you love my title. My title was Unshrouding Turin.

44.77 - 45.71 Aaron Hartwig

Yeah. Love it.

46.35 - 50.831 Bryan Cantrill

But I noticed the title here is Benvenuto a Torino.

51.911 - 52.592 Aaron Hartwig

That was for Robert.

53.112 - 53.712 Bryan Cantrill

That was for Robert.

53.732 - 57.392 Aaron Hartwig

I thought he'd enjoy that more. Robert, which one do you prefer?

57.753 - 85.303 Bryan Cantrill

Oh, put Robert on the spot. Go ahead, Robert. There we go. What an all pro. You know what this reminds me of? This reminds me of when my now 20-year-old was four. We understood from one of his friends in the neighborhood that he and this girl were going to get married. And they were like, okay, that seems like a little bit heavy for four. Yeah.

86.224 - 106.76 Bryan Cantrill

And we were talking to another parent at the preschool, and she was saying that her daughter and Tobin were going to get married. I'm like, God, this kid's a real, gets around, real gigolo here. Well, as long as, you know, he's got his, I guess he's got, you know, when you're a four-year-old, I guess you have a playmate in every port. And we are at the beach.

106.84 - 113.345 Bryan Cantrill

We're at Crissy Field with one of these girls, and the other girl comes up.

114.666 - 114.907 Aaron Hartwig

Oh, man.

115.652 - 132.628 Bryan Cantrill

Oh, and you're like, okay, what? And I'm reminded a little bit of Robert on the spot over here. And like Robert, my four-year-old takes the hand of one of the girls, takes the hand of the other girl, and then the three of them all go running off together.

132.648 - 134.15 Aaron Hartwig

Delightful.

134.41 - 136.632 Bryan Cantrill

I'm like, all right, you know, go for it.

138.294 - 140.616 Aaron Hartwig

You're like, I'm just going to write this down to tell at your wedding.

141.654 - 165.133 Bryan Cantrill

Absolutely, or weddings. I mean, who's to say that this, you know, who's to say that this won't carry into adulthood? Yeah, so I'm raising a bigamist. Anyway, I am, regardless of the title and who it was designed to appease, I am very, we're very excited to be talking about Turin and the Turin launch. This is AMD's latest part. George, thank you very much for joining us. Really appreciate it.

165.553 - 168.896 Bryan Cantrill

You are a repeat friend of Oxide. It's good to have you back.

170.08 - 173.503 George Cozma

Good to be back. Excited to be here.

173.563 - 184.613 Bryan Cantrill

So you had a great blog entry that I was excited to see it at the top of Hacker News over the weekend. Were you surprised by that?

186.875 - 213.462 George Cozma

No. I've noticed that Hacker News has been picking us up more and more. That's great. And, ironically enough, we recently moved over to Substack, and we noticed that, seemingly, the SEO for Substack is a lot better. So that article got a lot more traction.

213.482 - 214.883 Nathanael Huffman

Oh, interesting.

215.904 - 221.347 George Cozma

So it being at the top of multiple sites, aggregator sites, doesn't surprise me.

221.788 - 226.03 Bryan Cantrill

You know, maybe the SEO, it may just be also that you just got a great article on a hot topic.

226.07 - 230.153 George Cozma

You know, this could be... Yeah, and what's funny was the video actually did really well.

230.413 - 232.014 Bryan Cantrill

So, okay, I'm glad you brought up the video.

232.034 - 237.478 George Cozma

Yeah. Oh, so there was the video part of the article.

237.658 - 246.742 Bryan Cantrill

No, no, I'm glad you brought it up. Because those, you, the comments on that video were the nicest YouTube comments I've ever seen in my life.

247.922 - 249.343 Robert Mustacchi

Yeah.

250.383 - 252.464 Bryan Cantrill

I didn't even know YouTube comments could be nice.

253.364 - 254.225 Robert Mustacchi

Yeah.

255.225 - 264.824 George Cozma

And most... What I've noticed is that, so you guys can't see the like ratio, but it currently has a 100% like ratio.

266.52 - 270.024 Bryan Cantrill

I mean, what is going on? This is, this can't be a YouTube video.

270.184 - 271.205 George Cozma

It's not YouTube.

271.325 - 285.258 Bryan Cantrill

It's not YouTube. There's something, this thing has fallen into some alternate reality and like, and the comments are all like, you know, thanks for all of your diligent work. And you know, I just, I, I love, I mean, it just, it's great. God, like we talk about lightning in a bottle.

285.278 - 297.991 George Cozma

I don't know if you saw these, Adam, but it was just like, I have, I think a lot of it is because like, I mean, we, what, last month or just the month before, AnandTech closed?

299.792 - 300.072 Bryan Cantrill

Yes.

300.792 - 315.098 George Cozma

For the last time, a lot of the older folks like Ace's Hardware, Real World Tech, David Kanter doesn't really write anymore. So a lot of the in-depth stuff has sort of disappeared over time.

316.127 - 323.436 Bryan Cantrill

Yeah, so you think that this has given the internet some gratitude? You've managed to domesticate the internet.

326.239 - 335.425 George Cozma

More so that I think people want an alternative to... That really goes in depth.

335.785 - 358.591 Bryan Cantrill

And look, it was like if there was a YouTube video that we're going to start having nice comments on, that was a good one to start on. That was a great video, went in depth. I love that you kind of had the surprise ending where you set a world record in your hotel room. Let's start with there. What was that benchmark that you were running? And you were running that on Turin, obviously.

359.451 - 395.717 George Cozma

So that was y-cruncher, a hundred billion, uh, BBP... excuse me, it's hard to say. Um, but basically, all it does is it's a compute benchmark. So it just wants as many threads and as high clocks as you can get. It's not memory bound at all. But the prior record at the time of that video was about 10 seconds. And that was a sub-five-second result.
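[Editor's sketch] Assuming the benchmark being named here is y-cruncher's BBP mode, it rests on the Bailey-Borwein-Plouffe formula, which yields a hexadecimal digit of pi at a chosen position without storing the digits before it. That would fit the remark that the workload wants threads and clocks rather than memory. The function names below are illustrative, the code uses double-precision floats (so it is only trustworthy for small positions), and it is not y-cruncher's implementation.

```python
def _bbp_series(j: int, n: int) -> float:
    """Fractional part of sum_k 16^(n-k) / (8k + j)."""
    s = 0.0
    # Terms with k <= n: modular exponentiation keeps only the fractional part.
    for k in range(n + 1):
        r = 8 * k + j
        s = (s + pow(16, n - k, r) / r) % 1.0
    # Terms with k > n: rapidly shrinking tail, summed directly.
    k = n + 1
    while True:
        term = 16.0 ** (n - k) / (8 * k + j)
        if term < 1e-17:
            break
        s += term
        k += 1
    return s % 1.0


def pi_hex_digit(n: int) -> int:
    """Hex digit of pi at position n after the point (n = 0 is the first)."""
    x = (4 * _bbp_series(1, n)
         - 2 * _bbp_series(4, n)
         - _bbp_series(5, n)
         - _bbp_series(6, n)) % 1.0
    return int(x * 16)


if __name__ == "__main__":
    # Pi in hexadecimal begins 3.243F6A88...
    print("".join(format(pi_hex_digit(i), "x") for i in range(8)))
```

Each position is computed independently with a handful of running sums, so the work spreads across cores with essentially no shared data.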

396.317 - 411.84 George Cozma

And I see Jordan in the chat or as one of the audience members, he was running it with someone else in the room, Jeff from Craft Computing. And I was doing my video and I see him in my eye trying to wave this laptop.

411.88 - 423.307 Robert Mustacchi

I'm like, I couldn't really say anything. I was like, what's going on? And he's like, do you want to show the audience that we just broke a record? I'm like, okay, completely unplanned. I had no idea that was going to happen.

423.327 - 430.951 Bryan Cantrill

It was pretty great. It was pretty great. And I love that you're like kind of wrapping it up. You're like, no, no, actually, wait a minute. Hold on. I'm not being, this laptop just being handed to me.

431.571 - 433.812 Robert Mustacchi

Wait, there's literally, I pulled the Lisa Su. Wait, there's more.

436.874 - 456.394 Bryan Cantrill

There's more. Yeah. Okay. So, and obviously that setting, and it's helpful to know this is a very compute intensive workload. Because one of the things that I think that we've heard from a bunch of folks is this thing is so much more compute that now you've got to really ask questions about balance of the system and memory bandwidth and so on. So I want to get into all that.

457.165 - 475.052 Bryan Cantrill

Um, I guess one thing I would kind of ask you from the top, um, just what is your kind of top takeaway from the Turin launch? And was there anything that surprised you? Was there anything that you either didn't know what was coming or didn't know the kind of magnitude or you're still, yeah.

476.033 - 486.951 George Cozma

Sort of three things here, if you don't mind. Um, what surprised me most about Turin specifically was one, the fact that they hit five gigahertz on some SKUs.

487.011 - 487.691 Bryan Cantrill

Yeah, I was going to ask you that.

487.711 - 507.002 George Cozma

On a server CPU. Yeah. But not just hit five gigahertz. Like I wrote in the article, Wendell from Level1Techs, in an essentially web server workload, was hitting 4.9 gigahertz all core. That's nutty. That's utterly, utterly nutty.

508.991 - 518.853 Bryan Cantrill

That is crazy. I feel like the last time we were really, it's been a minute since we've seen clocks that high from anywhere, I feel. I feel it's been like, I mean, IBM was hitting it with power.

518.873 - 552.383 George Cozma

Well, no, IBM Z is the only, or really the only folks that do that sort of over five gigahertz consistently in server. Intel back in the day, if you remember those Black Ops CPUs, they were doing five gigahertz. And I believe that there was one, the last Oracle SPARC CPU, the M8, went up to five gigahertz. Did it really? But other than that, I'm having to draw upon some fairly niche CPUs here.

552.563 - 578.851 George Cozma

The fact that what is effectively a mainstream CPU can do this is crazy. But add on to that, just... The fact that Zen 5C, so sort of those compact cores, have the full 512-bit FPU, I think that that's really impressive considering that they're sticking 16 of them into a single CCD now.

580.764 - 602.679 Bryan Cantrill

Yeah, so let's elaborate on this a little bit because this is, I think, a really interesting point. So we're seeing this from Intel too, right, where once you get the density up to a certain level, you've got to make some compromises. But the compromises that AMD seems to be making are much less than the compromises you're seeing. I mean, the Zen 5C cores are – they're still Zen 5 cores.

603.519 - 628.706 George Cozma

Yeah, so here's the difference between Zen 5 and Zen 5C. From an architecture perspective, nothing. Until you hit L3, there's no difference. Now, are they on different nodes? Yes. Is there an Fmax difference? Yes. But the fact that they're still hitting 3.7...

631.377 - 659.38 George Cozma

Fmax, so that's your top clock, is really impressive. And the node difference is, I think you're on what, three nanometer for the 5c and four nanometer for the Zen 5, is that right? Yes. These are all TSMC, obviously. So on their C cores, they jumped, uh, 600 megahertz, so from 3.1 to 3.7. Um, God, that is amazing. Yeah, yeah. And there was a good point made, and I think the biggest jump

660.772 - 670.477 George Cozma

in recent AMD history in terms of the server performance. It's not Zen 4 to Zen 5. It's Zen 4C to Zen 5C.

670.917 - 671.358 Bryan Cantrill

Interesting.

671.478 - 695.525 George Cozma

I think that performance jump is far from... Like, I think that that's really, really exciting performance. But what I really like about Turin is not just that there's these big old high-end SKUs, the 128 and 192 core SKUs, but they've paid attention to this mid-range, right?

695.986 - 722.744 George Cozma

With the 9575F, that's a high-frequency 64-core SKU that I was talking about that was getting 4.9 from Wendell's testing on all cores. I think that that sort of SKU, the fact that they still are paying attention to sort of the mid-range is really good. It's expensive, no doubt, but... Yeah.

723.905 - 744.855 Bryan Cantrill

And the fact that you've got this kind of SKU stack now that is kind of a uniform SKU stack where you can start to really make some interesting trade-offs as you look at workload. And it just, it feels like, you know, and Robert here is our resident SKU stackologist. Yeah.

745.595 - 764.103 Bryan Cantrill

which definitely, I mean, there were times with Intel where you've got like these three different, where you had like gold, silver, and platinum, and you've got- Also bronze. Bronze, right. And it just, it did require like you to get a postdoc to figure out like which part you want. And Robert, what do you make of this Turin SKU stack?

764.143 - 769.245 Bryan Cantrill

I mean, it seems like it's a pretty clean SKU stack in terms of making different trade-offs as you go up and down it.

769.685 - 786.223 Adam Leventhal

Yeah, I think AMD's biggest strength is the fact that you're basically only choosing the number of cores, what the frequency is, and cache size. Otherwise, all the other features are the same. And I think that actually ends up being pretty powerful. You're not getting into a case where it's like, oh, do you want to have fast memory? Different SKU. Do you want to have a different...

787.665 - 792.209 Adam Leventhal

Do you want to have RAS features? Ooh, sorry. That's going to cost you. That's going to cost you.

792.629 - 800.295 Aaron Hartwig

So you're not having to make trade-offs. You kind of, you kind of take from the buffet table, but you're not needing to compromise. Is that right?

800.575 - 816.768 Bryan Cantrill

Yeah. I mean, certainly. And I think that, that, you know, when on the Intel side, when you go from these P cores to E cores, Adam, you're going from the P is for performance and the E is for efficiency. You're like, Oh, as, uh, uh,

818.632 - 830.941 George Cozma

So, Joe McCurry from AMD put it, economy cores. Economy cores. When he said that on stage, Ian and I were laughing for about five minutes straight.

831.322 - 831.982 Robert Mustacchi

That's really good.

832.222 - 835.304 Bryan Cantrill

Yeah, because in particular, you don't have AVX-512 on those.

836.185 - 845.372 Adam Leventhal

No. That's the most shocking bit to me, is that you basically still never, despite Intel championing it and trying to put all that energy into everything over the years, you still just got nothing.

846.432 - 855.889 George Cozma

And I... let's just say that's, that's something I've harped on Intel about is they really, the fact that they now have an ISA segmentation.

856.189 - 857.37 Bryan Cantrill

Yes. Yeah.

857.49 - 866.535 George Cozma

It's bad. It's bad. Don't segment your ISA, please. Yeah. Like that's, that's how you shoot yourself in the foot. Something fierce.

867.316 - 881.403 Aaron Hartwig

Just to elaborate on that. So what that means is like the operating system needs to make scheduling decisions that, within the same chip about whether certain workloads can run on certain of the cores.

881.423 - 896.029 George Cozma

Is that right? So on the client side, yes, that's true. On the server side, that's not correct. All server chips have the same cores. So there are two Xeon 6 lineups, the Xeon 6000Ps and the Xeon 6000Es.

900.179 - 926.27 George Cozma

And the P's are for the P-cores. There's no merger of... Yeah, so just to be clear, it's like Ryanair, it's like all economy. Yes, on this one. But the thing with that is, and here's sort of the cleverness of Zen 5 versus Zen 5c, and that is AMD can just have one big SKU stack and that's it. Yeah. Intel, you need these two different SKU stacks.

927.972 - 931.597 George Cozma

And it can start getting confusing with what has what and where.
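[Editor's sketch] The "what has what and where" question can at least be checked on a running Linux x86 machine by reading the per-CPU `flags` lines in `/proc/cpuinfo`, where the AVX-512 foundation feature appears as `avx512f`. The helper name and the sample text below are hypothetical, and this is only a rough illustration of the check, not anything Oxide or AMD ships.

```python
def cpus_missing_flag(cpuinfo_text: str, flag: str = "avx512f") -> list[int]:
    """Return the logical CPU ids whose 'flags' line does not list `flag`."""
    missing = []
    current = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key = key.strip()
        if key == "processor":
            # Start of a new logical CPU stanza.
            current = int(value)
        elif key == "flags" and current is not None:
            if flag not in value.split():
                missing.append(current)
    return missing


# Hypothetical two-CPU excerpt in the /proc/cpuinfo layout.
SAMPLE = """processor : 0
flags : fpu sse2 avx2 avx512f
processor : 1
flags : fpu sse2 avx2
"""

if __name__ == "__main__":
    print(cpus_missing_flag(SAMPLE))
```

On a real host the input would be `open("/proc/cpuinfo").read()`; an empty result means every logical CPU reports the flag.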

932.277 - 953.98 Bryan Cantrill

Well, and then it's like, so it's really tough too, because if you are, you know, in our position of like, we are selling a rack scale computer to someone who is making compute available to their kind of customers, right? We're asking, if you have to ask them, well, are you using AVX 512? They'll be like, I don't know. I have to go ask my users that.

954.7 - 961.784 Bryan Cantrill

And it makes it really hard, and you're kind of at this big fork in the road. So to be able to have a...

965.225 - 985.431 Bryan Cantrill

To not have to give up... I mean, just as you said, George, to not have to compromise on ISA and to get the same ISA everywhere. And yes, you may be giving up, you know, you're making some trade-offs in terms of max frequency and so on, but you're not like dropping off... Your trade-off is in area. Right, right. Yeah, you have to spend that area. Um, but the

987.463 - 1022.23 George Cozma

The thing here is I think the area would be, even if it was a bare-bones implementation, I don't know if you guys remember Centaur, Centaur CNS. That was, if you guys remember VIA, that was a VIA CPU that never actually hit market, but you were able to test, um, thanks to a couple of guys who acquired some during the Centaur, uh, buyout days about three years ago. Um.

1025.183 - 1030.888 George Cozma

It had a very basic and bare bones AVX 512 implementation. But that's fine.

1031.229 - 1039.336 Bryan Cantrill

I feel like the last time you were on, I was getting a bunch of grief for dropping some dated references. This is an old part, just to be clear.

1039.676 - 1059.726 George Cozma

Well, here's the thing, right? It's an old part, absolutely. But Intel bought the team, and it was essentially an acquihire. They got the folks from Centaur three years ago. Oh, so this is a recent thing.

1059.766 - 1061.768 Bryan Cantrill

This is not Centaur.

1061.808 - 1073.836 George Cozma

No, no, no. So Centaur was broken up essentially in 2021, as the Wikipedia article says. VIA still has x86 licensing, but the Centaur team isn't at VIA anymore.

1074.676 - 1084.641 Bryan Cantrill

So this was in 2021. So I assumed you were talking about the, like the, okay, this is, you're not talking about a chip that was made in like 1999. No, no, no, no, no.

1084.721 - 1089.804 George Cozma

I'm talking about a chip that was due for release in 2022.

1091.404 - 1092.485 Bryan Cantrill

And they were broken.

1092.505 - 1094.086 George Cozma

But was canceled just before launch.

1094.926 - 1104.171 Bryan Cantrill

Okay, so I've got so many questions about this Wikipedia page. So they were broken up. First of all, like nice use of the passive voice. Like they were, like what broke them up?

1105.359 - 1131.798 George Cozma

The iceberg. The iceberg broke the line. So VIA, who was the parent company, basically shuttered the Austin headquarters. And the team was acquired by Intel for $125 million. That's a hell of an acquihire. Yeah. Here's the thing.

1134.195 - 1155.29 George Cozma

... was expecting it, and it was very, um... the amount of information that has been shared about it has been close to zero. Interesting. And if you ask anyone that formerly was there, they don't say anything. It's kind of weird. Okay, interesting. And so then VIA, I didn't realize that VIA was an x86 license holder.

1156.77 - 1183.872 George Cozma

So they had licenses from IDT, which is what Centaur used to be, and Cyrix, because they acquired Cyrix. Right. So they had licensing, and I think they still technically do, which is how Zhaoxin, which is a Chinese x86 manufacturer, has the ability to make x86 CPUs, which, that's a whole history in and of itself.

1186.948 - 1198.54 Bryan Cantrill

God, thank you for opening up all of these doors. Did you see this? There's this documentary, The Rise of the Centaur, covering the early history of the company. It's like, okay, that's must-see TV. I mean...

1202.817 - 1215.521 George Cozma

But the reason why I brought them up is because what was supposed to be their newest core, CNS, was supposed to have had AVX-512 capabilities, but it was a very bare-bones implementation of AVX-512.

1215.641 - 1217.242 Bryan Cantrill

Oh, interesting. Okay.

1217.702 - 1221.523 George Cozma

And if the E-cores had had a bare-bones implementation,

1228.496 - 1247.143 Bryan Cantrill

Yeah, interesting. And the fact that it's not there at all really does, it is, as you say, it's a separate ISA. Yeah. Is that a good segue to the AVX-512 improvements on Turin? I mean, I felt like going into this launch, I mean, I felt that's one of the headliners was the improvements to the AVX-512.

1247.163 - 1278.7 George Cozma

Yeah. Alexander Yee, the creator of y-cruncher, did a very good write-up on Zen 5's AVX-512 implementation. Oh, yeah. And he went very, very into it and basically said, yeah, this is the best AVX-512 implementation so far.

1282.56 - 1294.064 Bryan Cantrill

Yeah, really. There are some... The data path, obviously, is a big part of that, I assume. The fact that it's going from a 256-bit wide data path to a 512-bit wide data path. Or deeper than that.

1294.084 - 1329.729 George Cozma

That's part of it. Another part is just the increase in the number of registers. They doubled the number of registers. Yeah. They made a lot of ops single cycle, which is nice. There were some trade-offs that were made. Some of the integer stuff was made two-cycle, which was a bit of a cavity in the tooth, so to speak, fly in the ointment. But other than that, it's...

1332.281 - 1359.261 George Cozma

The way that AMD can just not have to... So with Intel, you always had that sort of clock offset where if you run any AVX-512 code, you would suddenly decrease in your clocks, right? AMD doesn't have that. How they accomplish that, I have no idea. But you can throw in AVX-512 instructions and...

1361.255 - 1389.928 George Cozma

thermal and power, um, like clock speed, um, pullback. It won't have this turbo clock thing where, if you introduce any AVX-512 instructions, even if they're just load/store instructions, they'll decrease the CPU clock regardless. You don't have that with Zen 5 or Zen 4 for that matter.

1390.189 - 1401.921 Bryan Cantrill

And was that on, was that Sandy Bridge or Haswell, Robert, where it was like, I mean, AVX-512 has always been kind of had this kind of problematic property that if one thread starts using it, it kind of like browns out the rest of the part.

1403.602 - 1404.183 Robert Mustacchi

It's Skylake.

1404.503 - 1430.416 Adam Leventhal

Skylake is what did that. But you also had this problem on Broadwell with AVX2 and the others. And it's actually worse than just running an instruction. If you actually just leave the AVX save state bit such that Intel thinks it's modified in the register state, that's enough to trigger this slowdown sometimes. Oh, man. You might not remember this. We had a nasty bug back at Joyent.

1432.017 - 1451.185 Adam Leventhal

where we had a Windows guest, and we just weren't properly clearing one part of the save state in the initial state. So the initial state basically was like, you know, the 256-bit, not the opmasks, but the 256-bit register state, the YMM state, is valid. It's like, okay, I'm going to no longer boost.

1452.866 - 1459.869 Bryan Cantrill

Wow. And so you basically run this guest and it would like crater performance for the whole box for no actual appreciable gain. 'Cause no one's actually using it.

1460.916 - 1472.099 Adam Leventhal

Yeah. I mean, basically just that one for, it was just for it, but it was just, that's, I think that to me is kind of the even more telling, but even if you just leave the state in the save state, then you're, you're toast.

1472.86 - 1482.823 Bryan Cantrill

And it's really hard to have a feature like that, where if you use this feature, it has this, this kind of this adverse effect on the rest of the monkey's paw kind of feature.

1483.283 - 1487.544 Aaron Hartwig

Yes. Yeah. Yeah.

1489.571 - 1508.296 Bryan Cantrill

It's very hard to reason about the performance when you have these kind of problems. So, and the, and AMD is not needed. And I mean, you know, and it sounds like George, you've got the kind of the same question of like, you just come so accustomed to these kinds of intense compromises that come with AVX 512. It's kind of amazing that we can have it all.

1510.937 - 1538.586 George Cozma

Yeah. It's... how do I put this? So I think what Zen... And if you read the initial sort of coverage of Turin... Sorry, not of Turin, of Granite Rapids. So that's Intel's newest Xeon chip.

1538.826 - 1539.066 Bryan Cantrill

Right.

1541.008 - 1576.261 George Cozma

They're... All the coverage was like, yeah, it's good. It competes with Genoa. But we were all briefed before this. And we were just sort of thinking, yeah, but is this actually going to compete with what's coming up next? And it's, let's put it this way. At least AMD isn't competing with itself anymore, if you get what I mean. But it's not a good competition for Intel.

1576.381 - 1584.646 George Cozma

Like, it's not a winning one for Intel. They can at least bid on something and not be laughed immediately out of the room, but...

1586.607 - 1611.732 Bryan Cantrill

Yeah, and you would expect that to like, I mean, flip a little bit with Sierra Forest, but then you end up with this kind of this E-core business. And I mean, I think there's gonna be, I mean, there's gonna be things that are gonna be interesting over there, but Turin is a very hard part to compete with. It's done a pretty good job across the board. Yeah. Yeah.

1614.193 - 1627.123 Bryan Cantrill

And so I think, George, I'm not sure you got to all three of the things that you had that were, so the frequency, the high number just in terms of the F cores and getting that up to five gigahertz, especially across all cores.

1628.284 - 1648.617 George Cozma

And the fact that there's only two 500-watt SKUs. I actually really like that. The fact that while they are going up to that 500-watt SKU, they're only really leaving it for the highest-end parts. Whereas everything else is 400 or below. And I really actually respect that.

1650.533 - 1670.082 Bryan Cantrill

I can tell you that we at Oxide also like that. Folks think about the kind of the rack level, because we're kind of left with the rack level power budget. And yeah, we definitely, it's nice to have a SKU stack that is not all sitting at 500 or 500 watts plus, right? I mean, I think of that.

1670.142 - 1695.934 George Cozma

Yeah, yeah. It has... 400 watts is still a large amount of power, no doubt. But if you remember the slide that AMD showed with a seven to one consolidation. If you could do that, right? If you can go from a thousand racks of 8280s to 131 racks of- Oxide, what?

1703.785 - 1716.523 George Cozma

Yeah, but if you can do that reduction, even though the single SKU power has gone up, your total power savings has gone, like your total power of the data center has gone down.
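[Editor's sketch] The consolidation argument is simple arithmetic, and a rough version can be written down. Only the 1000-to-131 reduction and the 500-watt top-end figure come from the conversation; the 205 W figure is the published TDP of the Xeon Platinum 8280, and two sockets per unit is an assumption. The calculation counts CPU TDP only and ignores memory, storage, fans, and networking.

```python
def fleet_cpu_power_watts(units: int, sockets_per_unit: int, tdp_watts: int) -> int:
    """Total CPU TDP across a fleet; ignores every non-CPU power draw."""
    return units * sockets_per_unit * tdp_watts


# Assumed inputs: dual-socket units on both sides of the comparison.
old_fleet = fleet_cpu_power_watts(units=1000, sockets_per_unit=2, tdp_watts=205)
new_fleet = fleet_cpu_power_watts(units=131, sockets_per_unit=2, tdp_watts=500)

# Fraction of CPU power saved even though per-socket TDP more than doubled.
saving = 1 - new_fleet / old_fleet

if __name__ == "__main__":
    print(f"old: {old_fleet} W, new: {new_fleet} W, saving: {saving:.0%}")
```

Under these assumptions the per-socket power rises by roughly 2.4x while the fleet-level CPU power falls by about two thirds, which is the shape of the point being made here.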

1717.779 - 1740.418 Bryan Cantrill

It has, and I think economically, too, it's interesting. These are expensive parts, but they can do so much, especially at that high core count level, when you're not having to sacrifice on what those cores can go do, that you can make it make economic sense, I think. Yeah. it's a big step function over where they've been.

1740.458 - 1748.261 Bryan Cantrill

I mean, I think that we, you know, I think like a lot of people, like the Genoa SKU stack was a little less interesting for us.

1748.481 - 1783.585 George Cozma

It was obviously a lot more focused towards the high end. Yeah. Like it was very much skewed, pun intended, towards the hyperscalers, right? Towards those people who can take that much power and just not care. But I think especially with the Turin SKU stack, I think it's a lot more sort of a refresh across the board. Yeah. Which is really good. Yeah.

1785.734 - 1808.871 Bryan Cantrill

Yeah, it is a good segue, Robert, to the kind of our thinking on Turin, because we, so George, as we're kind of thinking about our, I mean, our next gen sled is obviously a Turin-based sled. We did deliberately elect to kind of bypass Genoa and to intersect with Turin. Maybe describe our thinking a little bit there, Robert, and we've got

1810.134 - 1815.698 Adam Leventhal

Sure, yeah, we can go a bunch of different places. I think the starting place is really actually going back to Milan.

1816.059 - 1816.259 Bryan Cantrill

Yeah.

1816.279 - 1832.672 Adam Leventhal

Because actually like 64 core Milan, like the 7713P, or even if you go up a little bit, getting that in 225 watts or 240 watts, that was actually really nice. Really nice, yeah. That was really nice. Performance per watt on Milan is really pretty great.

1833.232 - 1842.883 Adam Leventhal

Uh, yeah, I mean, I definitely, I mean, when you get, you know, it's hard to compare to the, you know, 192 Zen five C cores, uh, in, in that range. But, um,

1844.618 - 1870.74 Adam Leventhal

Once you kind of get to Zen 5, I mean, I still miss that there's no 225-watt, 64-core part, but I think this is where you're going to see this from our end, trying to think about how do we get a little more flexibility, leverage the fact that you have base DRAM has increased in capacity without going to 3D-stacked RDIMMs, just because that part of the balance and price equation starts getting...

1871.821 - 1895.162 Adam Leventhal

really thorny. So the fact that you have 128 gig RDIMMs, um, is useful, especially when you start looking at the fact that 2DPC stops making, gets challenging, uh, fast, uh, there. So I think there's a bunch of different SKUs you can kind of start to look at. You know, I think one thing I've been keeping my eye on is actually the 160-core, um, Zen 5c, uh, as one thing to look at.

0
💬 0

1895.202 - 1912.6 Adam Leventhal

Because that kind of keeps you below 400 watts. So I think it's still like a Group E CPU as opposed to a Group G in the IRM. So do you want to describe a little bit of those terms? When you say the Group G versus Group E, what are those things?

0
💬 0

1912.96 - 1939.316 Adam Leventhal

Yeah, when AMD creates a new socket, they put out what they call an infrastructure roadmap, or IRM, and then they basically predefine different TDP ranges into these groups. So, for example, Group E probably has some range, off the top of my head, from like 320 to 400 watts. These new 500-watt CPUs, I sometimes joke, are Group G guzzlers, just because they definitely take a lot.

0
💬 0

1939.696 - 1963.084 Adam Leventhal

But you can design your platform to different TDP thresholds, these different infrastructure ranges, and then you'll get different CPU and core counts. So I think if we look at it, there's like three or four different 64-core CPUs. I think there's the 9535, which is kind of like the, you know, almost 300-watt 64-core.

0
💬 0

1963.564 - 1974.009 Adam Leventhal

That'll be like a Group A CPU. Yeah. And that tells you what the TDP range is and what you as the platform designer can tweak, from the min to the max on there.

0
💬 0

1974.469 - 1989.317 Adam Leventhal

Whereas the others will come to Group E, or some of these even smaller ones, like some of the 32 and 16 cores, if they're not getting cranked up for frequency, might even be in like Group B. And related.

0
💬 0

1990.558 - 2002.949 Bryan Cantrill

But as we were thinking about Cosmo, Cosmo is our codename for our next-gen Turin-based sled, SP5-based sled. What were we thinking in terms of what groups we wanted to target and kind of like the trade-offs there in terms of flexibility?

0
💬 0

2003.777 - 2018.831 Adam Leventhal

Yeah, that's a good point. So I think a lot of what we do is a bunch of work that we do with actually Eric, who's also on here. We're trying to figure out, hey, what's the right balance of how many stages, how many components do we need to kind of reach what kind of what power group?

0
💬 0

2018.851 - 2040.828 Adam Leventhal

So, for example, when we did SP3, we designed to what they called Group X, which was the group they added later for the 3D V-Cache and max frequency SKUs. Maybe it's like a 240 or 280 watt max. But then we ran a 225-watt CPU in there the entire time, giving us plenty of margin, plenty of headroom, which meant that, you know, our power subsystem was very clean.
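
The headroom described here can be put in numbers. This is a minimal sketch, not Oxide's actual design math: the 280 W design limit is only the upper of the two "maybe" figures quoted above, and the function names are hypothetical.

```python
def tdp_headroom(design_limit_watts: float, cpu_tdp_watts: float) -> float:
    """Return the unused fraction of the platform's power design limit."""
    if design_limit_watts <= 0:
        raise ValueError("design limit must be positive")
    return (design_limit_watts - cpu_tdp_watts) / design_limit_watts


def supports(design_limit_watts: float, cpu_tdp_watts: float) -> bool:
    """A board can only host a CPU whose TDP fits within its design limit."""
    return cpu_tdp_watts <= design_limit_watts


# Assumed figures: 280 W design limit (the higher "maybe" value), 225 W CPU.
print(f"headroom: {tdp_headroom(280, 225):.1%}")
# A board designed for a 400 W group cannot simply take a 500 W part.
print("400 W board takes 500 W CPU:", supports(400, 500))
```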

0
💬 0

2041.409 - 2061.64 Adam Leventhal

So here we're kind of saying, hey, we're going to start with group B as our target. We're going to see what it costs us to fit Group G. You know, does it actually cost us more stages, more inductors, more other parts? And then the first question is, can you air cool 500-plus watts, which is a different question entirely.

0
💬 0

2063.101 - 2068.643 Bryan Cantrill

Yeah. Eric, you want to describe some of the kind of the thinking there is as you're, as we were looking at the, what the PDN for this thing was going to look like.

0
💬 0

2070.104 - 2086.776

Yeah. So generally the, the trade-off is, um, not even so much cost, it's space. So if you look at that, in the chat there was a server that had two sockets and like 48 DIMMs or some insanity on it. And if you look at that thing, there's a whole lot of power stages in it.

0
💬 0

2086.916 - 2105.794

And if you look at the board designs for those things, having that many power stages basically creates a giant wall for you to route around, which sucks. So you have to pull out all those lovely PCIe lanes we love to use and route them around all those power stages because they don't really like going through them.

0
💬 0

2106.134 - 2109.438 Bryan Cantrill

And when you mean route, we're talking the physical layout.

0
💬 0

2109.458 - 2110.339

Traces on the PCB.

0
💬 0

2110.359 - 2111.9 Bryan Cantrill

Yeah, traces on the PCB, right.

0
💬 0

2112.681 - 2139.262

Yeah. And so for our power design, I tend to bias towards the conservative side of things. One, because we're not building a million of these and I don't have a finance person beating down my door over $5 in extra components. And two, because it gives us flexibility, right? So if we wanted to run those 500-watt chips, I'm going to be able to do that. Yeah.

0
💬 0

2139.949 - 2163.144 Bryan Cantrill

Yeah, and I feel that, like, also, Eric, I think whatever size Oxide becomes, even if we're selling millions of them, I will help slay the person that comes to your door over the $5. Because I feel like on so many of these parts, I mean, yes, they add up and it's part of the BOM. But, man, the cost of these CPUs is so much greater. And getting the flexibility is so much more important.

0
💬 0

2164.267 - 2189.204

Having the reliability of having margin. Yes. Not having to worry about it. It's like, okay, you know, I can push it down to like four millivolts of margin to the spec. It's like, but why? So I can save five dollars? Like, that's dumb. Don't do that. And so I'd love to get a bunch of commodity servers and just start throwing them through power testing and see what they do. I don't know how close they are.

0
💬 0

2190.06 - 2206.824 Bryan Cantrill

Oh, and I think, I mean, George, I'm sure your experience has been up there, but I do love a bunch of the reviews online of Turin cautioning people to not do exactly this. Like, by the way, your SP5 motherboard may not be able to take some of these SKUs.

0
💬 0

2208.144 - 2229.717 George Cozma

Yeah, basically what AMD has said is there's these 500-watt SKUs. But if you are an end user, like if you're a small-medium business and you bought, say, four Turin servers from Dell or whoever, don't just swap out the chips.

0
💬 0

2230.257 - 2235.26 Robert Mustacchi

Please make sure that your boards can actually support these chips.

0
💬 0

2236.55 - 2249.4 Bryan Cantrill

Yeah. And the problem too will be that if you push these things to the margin, I mean, you can get misbehavior. It won't be as simple as like burning the house down.

0
💬 0

2249.42 - 2266.173 George Cozma

And you can get some very weird, weird, bizarre behavior that you're just going, why is it doing this? And you'll tear your hair out for a week trying to figure it out. And it's just because of the power.

0
💬 0

2267.335 - 2294.276 Bryan Cantrill

Yes, or if you recall us on the Tales from the Bring Up Lab episode where Eric was regaling us with some of our adventures on Gimlet, where our power was already pretty good, but we could not figure out why this chip would reset itself after 1.25 seconds. So we made our power even better. And Eric, my recollection of this was AMD's like, we have never seen power. This power's amazing.

0
💬 0

2294.476 - 2298.439 Bryan Cantrill

You've got amazing margin. It can't be that. And sure enough, it was not that. It was firmware, of course.

0
💬 0

2301.161 - 2318.592

Oh, yeah. It was firmware on a power stage. Oh, on a power stage. Yes. Yeah. Yeah. It was the control interface that the AMD processor uses to tell the power stages what voltage it wants. Turned out to need a firmware update. It's very much a face-slapping thing.

0
💬 0

2320.007 - 2342.84 Bryan Cantrill

George, it's kind of hilarious, because SVI2 was the protocol that the part uses to talk to the regulator. And the SDLE, this great part from AMD that we had used to actually model all this stuff, as it turns out, didn't have a hard dependency on getting the ACK back from the controller when we set the voltage to a specific level.

0
💬 0

2342.9 - 2353.687 Bryan Cantrill

The part, as it turns out, wants to hear that ACK, as we learned. We learned that the hard way. We learned that the hardest, most time-consuming, most expensive possible way. But we did learn in the process.

0
💬 0

2353.707 - 2374.122 Bryan Cantrill

I actually, Eric, I thought it was super interesting to learn that our power margins were really good on that because that was like a first natural line of attack was our power margins aren't like, that's why this thing is resetting because it is in a reset loop because our power's not good enough. But we actually learned in the process of doing that, like, no, no, this power's actually quite good.

0
💬 0

2375.492 - 2378.894

Yeah, it was rock solid and it was just stupid margin.

0
💬 0

2380.035 - 2390.921 Bryan Cantrill

So, Eric, as you're kind of thinking about like, okay, so we need to, you know, there are things we need to do. And were you coming to the conclusion that, okay, I think we can make this all fit? I mean, as you're doing that kind of that trade-off?

0
💬 0

2392.742 - 2411.064

Yeah. And so the big trade-off is, okay, are we going to have customers that need it? Are we going to even want to run a 500-watt part? Can we air cool 500 watts? Because we're not water cooling. And to me, as the power person,

0
💬 0

2411.084 - 2428.307

It basically came down to, it doesn't hurt anything for me to put another power stage on this thing and I can always turn it off and then it's just not doing anything, but it's also not contributing any heat to the system. So if I wanted to, I could turn it off and I wouldn't pay that much of a penalty.

0
💬 0

2429.051 - 2445.454 Bryan Cantrill

And importantly, we're able to use the same regulator; it's not like we're having to swap regulators to accommodate this. We were able to use the same Renesas parts for this. And then from a thermal perspective, okay, so that's kind of like, all right, we've got the insurance there.

0
💬 0

2446.275 - 2464.461 Bryan Cantrill

And then from a thermal perspective, we also needed to do the, because you said we're not water cooling this thing. So, you know, can we, and at 500 Watts, I think we definitely know we will not be talking about how quiet the fans are because you'll be lucky if you can hear us talk over the fans. Yeah.

0
💬 0

2464.541 - 2470.263 Bryan Cantrill

We know the fans will be cranking, but I think that, I mean, we've done that, and Doug and crew have done the model.

0
💬 0

2470.844 - 2472.164

I'm calling it. We can air cool it.

0
💬 0

2472.677 - 2472.917 Bryan Cantrill

Yeah.

0
💬 0

2473.437 - 2474.878 Adam Leventhal

No, that's the thing.

0
💬 0

2476.259 - 2497.427 Adam Leventhal

I mean, right now we've done all of our worst-case studies, which is basically saying, assume the CPU is going 500 watts, right? All the DIMMs are going at their maximum. You've got every SSD going at its maximum, and the NIC, and you're paying some amount of loss for all the stages. We still think we can cool that.

0
💬 0

2497.547 - 2506.529 Adam Leventhal

And then practically speaking, even though the CPUs with turbo boosting have a good way to eat up the rest of your power, you're usually not getting all of those devices maxed out all at the same time.
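
The worst-case study described above amounts to summing every component at its maximum and then accounting for conversion loss in the power stages. A minimal sketch follows; only the 500 W CPU figure comes from the conversation, while the other loads and the 90% efficiency are placeholder assumptions, not Oxide's numbers.

```python
def worst_case_watts(loads_watts: dict, conversion_efficiency: float) -> float:
    """Input power needed when every load peaks at once, including stage loss."""
    if not 0 < conversion_efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return sum(loads_watts.values()) / conversion_efficiency


# Only "cpu" is from the discussion; the rest are illustrative placeholders.
loads = {"cpu": 500.0, "dimms": 120.0, "ssds": 200.0, "nic": 50.0}
total = worst_case_watts(loads, conversion_efficiency=0.90)
print(f"worst-case sled input power: {total:.0f} W")
```

In practice, as noted above, the devices rarely peak together, so this figure is an upper bound for thermal design rather than a typical draw.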

0
💬 0

2506.669 - 2515.212 Bryan Cantrill

It is in particular, like it's real hard to max out the draw on your DIMM and the draw on your CPU at the same time without being mean spirited. I mean, you have to be really...

0
💬 0

2517.212 - 2519.893

Maybe AVX-512 will let us do it this time.

0
💬 0

2520.033 - 2524.495 Adam Leventhal

They've also gotten a lot more clever about how they do all the hashing across DIMMs.

0
💬 0

2524.875 - 2528.797 Bryan Cantrill

Substantially so. You're right. I should not be tempting the gods here.

0
💬 0

2531.098 - 2549.061

One thing that surprised me coming into this outside of the industry and coming into this is seeing like, okay, you got your 500 watt TDP, but that's not actually the peak power you can draw. They can draw over 800 watts transient. as they're scaling things up and down. And that just, it just blew my mind.

0
💬 0

2549.101 - 2562.947

It's like, wait, this 500-watt part can just spike up over 800 watts and then scale itself back down. Like, yeah, that's great thermally, but dammit, I've got to provide that. Right. Right.

0
💬 0

2563.047 - 2579.775 George Cozma

Yeah. And with the voltages that current server CPUs are running, you're having, at those 800-watt spikes, it's not 800 amps. It's 1,000-plus amps, which means more power stages.
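
The jump from 800 watts to more than 1,000 amps follows from I = P / V once the core voltage drops below one volt. A rough sketch, assuming a 0.8 V rail and a 0.1 milliohm delivery path purely for illustration (neither figure is from the conversation):

```python
def current_amps(power_watts: float, volts: float) -> float:
    """I = P / V: current the power stages must deliver at a given rail voltage."""
    if volts <= 0:
        raise ValueError("voltage must be positive")
    return power_watts / volts


def copper_loss_watts(amps: float, resistance_ohms: float) -> float:
    """I^2 * R: heat dissipated in the board's delivery path."""
    return amps ** 2 * resistance_ohms


# Assumed values: 0.8 V core rail, 0.1 milliohm effective path resistance.
amps = current_amps(800.0, 0.8)
print(f"{amps:.0f} A at 800 W")
print(f"{copper_loss_watts(amps, 0.0001):.0f} W lost in the copper")
```

Because the loss grows with the square of the current, even a tiny path resistance turns into real heat, which is the "giant resistor" problem mentioned next.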

0
💬 0

2580.756 - 2583.956

And just turning your board into a giant resistor, essentially.

0
💬 0

2584.337 - 2584.577 George Cozma

Yes.

0
💬 0

2586.357 - 2592.459

It turns out copper does not like 1,000 amps running through it in normal PCB thicknesses.

0
💬 0

2593.613 - 2605.73 George Cozma

So actually, the thing sort of related but unrelated to Turin directly that was announced at the event was Pensando, new Pensando stuff.

0
💬 0

2606.539 - 2606.759 Bryan Cantrill

Yes.

0
💬 0

2608.301 - 2610.504 George Cozma

And I sort of want your take on this.

0
💬 0

2612.126 - 2624.401 Bryan Cantrill

Interesting is our take. I mean, it's like definitely interesting. I mean, I think that we would love to be able to get some parts. The draw does become an issue back there for us. Yeah.

0
💬 0

2625.582 - 2644.9 Adam Leventhal

Yeah, I mean, the P4 programmable nature of it for us is something that's actually really powerful. We leverage that in our switching silicon a lot and have been looking for something to get that into the NIC. The big challenge is just, I think where we're a little different is a lot of the DPUs have been designed to basically be like, we're the compute, the DPU is the computer in charge.

0
💬 0

2645.66 - 2661.844 Adam Leventhal

And hey, you big, big CPU that's running guests over there, you're subordinate to me. So, you know, you exist, but only at my pleasure. And we were not quite as split brain there, slash we're not trying to sell the entire server.

0
💬 0

2661.964 - 2668.126 Adam Leventhal

So, you know, it just gets thorny when it's like, okay, that device also needs its own DDR5.

0
💬 0

2669.126 - 2670.807 Adam Leventhal

Uh, some questions around like, Oh, but yeah,

0
💬 0

2672.367 - 2678.634

Think of how much it could offload your processor. It's like, yeah, but the processor is still going to get maxed out. So now I've just increased my overall power.

0
💬 0

2679.014 - 2695.769 Adam Leventhal

Totally. Yeah. So when we end up designing for kind of not absolute density, but trying to get the best density in a fixed power budget, which because, you know, unlike the hyperscalers, we're not basically building a power plant next to every new DC. Yeah. that that's where it gets a little more challenging.

0
💬 0

2695.789 - 2713.114 Adam Leventhal

And so we're trying to work with folks to figure out, you know, hey, if I don't need, say, all of the Arm cores that show up there, or let's say I didn't run with DDR5, what can I still get out of there? You know, how can we change this from, you know, some of these parts are, and I don't remember what this one was, 50 or 75 watts.

0
💬 0

2714.074 - 2726.258 Adam Leventhal

And, you know, that's that, or, you know, maybe I, I play games and I say, um, you know, I've got a lot of SSDs, but maybe I don't need all of the IOPS, all those SSDs. So I can double up, you know, capacity instead.

0
💬 0

2727.039 - 2741.785 Adam Leventhal

And that gets me back some of the power and I can send that to the NIC. But even with just increasing power for the CPU, I'm already trying to think about, well, what do I do for folks who don't have all that power? If I've got 32 sleds, how do I...

0
💬 0

2742.125 - 2761.136 Bryan Cantrill

Well, and Ellie in the chat is saying, well, I don't think people realize how restrictive the Oxide power budget is. And I don't necessarily... it is restrictive. It's more that we are really taking that rack-scale approach. And so we're kind of the ones that are always adding up the Visa bill.

0
💬 0

2761.657 - 2787.055 Bryan Cantrill

And, you know, when you have 30, 40, 50, 60 watts, 70 watts, 80 watts in your NIC, that adds up in a hurry. And yes, you can offset it elsewhere. But what we're trying to do is get you the maximum amount of useful work out of that rack-scale power budget.
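
To see how NIC power "adds up in a hurry" at rack scale, multiply the per-NIC draw by the 32 sleds mentioned earlier. A small sketch; the function name is hypothetical and the per-NIC wattages are the ones listed above.

```python
SLEDS_PER_RACK = 32  # sled count mentioned earlier in the conversation


def rack_nic_watts(per_nic_watts: float, sleds: int = SLEDS_PER_RACK) -> float:
    """Total rack power spent on NICs, assuming one NIC per sled."""
    return per_nic_watts * sleds


for watts in (30, 50, 80):
    print(f"{watts} W per NIC -> {rack_nic_watts(watts)} W per rack")
```

Under the one-NIC-per-sled assumption, the difference between a 30 W and an 80 W NIC is 1,600 W per rack, which is power that cannot go to CPUs, DIMMs, or SSDs within a fixed budget.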

0
💬 0

2788.136 - 2798.782 Bryan Cantrill

And, you know, by being the ones that are doing that, we're the ones that are, you know, sometimes having to deliver some tough messages to folks about like, we like this, this is interesting, but it's drawing way, way, way too much power.

0
💬 0

2800.182 - 2811.789 Adam Leventhal

But yeah, overall, really good, really nice to see, excited to see that kind of P4 continue there and hoping someday we can find a way to make it make sense for us. But I think there's a lot of other folks who it does make a lot of sense for.

0
💬 0

2811.969 - 2837.307 Bryan Cantrill

Yeah, I mean, we're huge P4 fans, as folks know. And I think we've got... Actually, if folks can see it out there, we've got an exciting announcement in terms of Xsight Labs and using their part as our next-gen switch, and its P4 programmability. So we're really excited. We've been using P4 on the switch. We're going to continue to do that.

0
💬 0

2837.627 - 2853.617 Bryan Cantrill

And using P4 or programmability at the NIC, we're really interested in. But it's got to happen in a way that we can accommodate everything else we need to go do with the rack. So, George, a long answer to your question there, but it's interesting for sure.

0
💬 0

2853.757 - 2870.862 George Cozma

So speaking of how you guys don't want to do DPUs, because AMD didn't just announce a DPU, which is the Pensando Salina 400. They also announced Pollara, which I think is just a standard NIC. Like it's P4, but it's not a DPU.

0
💬 0

2871.342 - 2877.57 Bryan Cantrill

Yeah. Which is something that's going to be, and that is something that's going to be like, we're definitely interested in.

0
💬 0

2878.431 - 2882.377 George Cozma

Okay. Yeah. Cause I was going to say, is that the one that you're more interested in?

0
💬 0

2882.757 - 2896.222 Bryan Cantrill

Yeah, I think so. I mean, that's not going to intersect our first cut here of Cosmo. But no, we're really, really interested in it. And again, great to see that P4 programmability.

0
💬 0

2896.242 - 2911.407 Bryan Cantrill

There were some moments where it felt like we were a bit of a lonely voice, but I think other folks are beginning to realize, and I think as the hyperscalers themselves have known, that having that network programmability is really essential.

0
💬 0

2912.601 - 2919.927 Aaron Hartwig

Bryan, we've talked about P4 in the past, but for folks who maybe haven't listened to the back catalog, should we give a little overview? Sure.

0
💬 0

2920.708 - 2924.331 Bryan Cantrill

Robert, you'll do the honors. I'm happy to give my P4 spiel.

0
💬 0

2925.732 - 2945.967 Adam Leventhal

Sure, I'll see if I can do it justice. Effectively, the way I think about P4 is it basically is a programming language that you can use to compile a program that operates on the NIC's data plane. And I think this is an important part because for a lot of these things, the value is to actually run at line rate.

0
💬 0

2946.167 - 2968.08 Adam Leventhal

So you've got 100 gig, 200 gig, 400 gig, especially with all these 112 gig SerDes coming along. You basically can't necessarily treat that as a general purpose program that's coming in, DMAing everything back to, you know, a normal core's memory and processing it and sending it back out. But instead, this kind of lets it process the packets in line in that hardware receive path.
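
The model being described is a match-action pipeline: each packet is matched against a table, and the matching entry selects an action that runs in the data path. The toy below only illustrates that model in Python; it is not P4 syntax, real hardware would use longest-prefix or ternary matching rather than this exact-match dictionary, and every name in it is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Packet:
    dst_ip: str
    egress_port: Optional[int] = None
    dropped: bool = False


def forward(port: int) -> Callable[[Packet], None]:
    """Build an action that sends the packet out of the given port."""
    def action(pkt: Packet) -> None:
        pkt.egress_port = port
    return action


def drop(pkt: Packet) -> None:
    """Default action: mark the packet as dropped."""
    pkt.dropped = True


@dataclass
class Table:
    """Exact-match table: destination address -> action."""
    entries: Dict[str, Callable[[Packet], None]] = field(default_factory=dict)

    def apply(self, pkt: Packet) -> Packet:
        # Look up the key; fall back to the default action on a miss.
        self.entries.get(pkt.dst_ip, drop)(pkt)
        return pkt


table = Table({"10.0.0.1": forward(3)})
print(table.apply(Packet("10.0.0.1")))   # hit: forwarded to port 3
print(table.apply(Packet("10.0.0.9")))   # miss: dropped
```

The control plane fills in table entries at runtime, while the compiled program fixes which tables and actions exist, which is why the quality and openness of the compiler matters so much.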

0
💬 0

2970.413 - 2996.962 Bryan Cantrill

And having that higher level of abstraction really allows you to kind of express something programmatically that they can then use hardware resources efficiently. Our big challenge has been working with vendors to give us a substrate upon which we can build a true P4 compiler. Honestly, the biggest challenge in that part of the ecosystem has been the proprietariness of the compilers.

0
💬 0

2997.242 - 3016.33 Bryan Cantrill

So, George, I'll tell you that one thing that will be a factor for us as we're looking at kind of a P4-based NIC is what we have been looking for is what is that kind of x86-like substrate going to be? Something that is a documented, committed ISA that we can write our P4 compiler against.

0
💬 0

3016.53 - 3038.199 Bryan Cantrill

So what we are not looking for, because we are coming out of a bad relationship in this regard, is a proprietary compiler. We really... we developed our own P4 compiler and have open sourced it, x4c, for purposes of just doing development, software development and testing and so on.

0
💬 0

3038.779 - 3053.885 Bryan Cantrill

But we really want to take that and have that, use that to actually, uh, to compile, uh, for these parts for both the switch for sure. Um, and then ultimately the NIC, um, would definitely be our vision for where it's going.

0
💬 0

3056.63 - 3071.489 George Cozma

Yeah. And I know you guys, as were a lot of us, were disappointed with Intel discontinuing the Tofino line of switches.

0
💬 0

3071.669 - 3087.957 Bryan Cantrill

George, I really appreciate your sensitivity of taking us into the kind of the grieving room and your bedside manner here is really exemplary. I was really feeling you kind of passing the tissues to us as you, as you really felt our loss. I really appreciate that.

0
💬 0

3088.618 - 3103.173 George Cozma

Yes. But yeah. And one question I have with AMD is, does it make sense for them to make their own switch eventually?

0
💬 0

3106.149 - 3111.093 Bryan Cantrill

I mean, are we in charge of AMD now? Because we got lots of ideas.

0
💬 0

3112.114 - 3134.368 Adam Leventhal

Actually, George, this is the pitch I've tried to make to them. Just because to me, if you actually look at NVIDIA and what they've done with NVLink and basically buying Mellanox, at the end of the day, to really be able to, you know, deal with that, with what they're doing with Ultra Ethernet, it feels like you have a P4 engine. Yes. It's going to be a big change.

0
💬 0

3134.388 - 3156.579 Adam Leventhal

Take it from, you know, the NIC's kind of two-port form factor to, you know, a switch ASIC and kind of dealing with power. But I think that if you really want to do well in that space, you can't rely on just like, hey, I'm going to convince Broadcom to let me pass through, you know, XGMI through my switch.

0
💬 0

3157.239 - 3161.524 Adam Leventhal

Or whatever they're calling that Infinity Fabric transport these days.

0
💬 0

3161.684 - 3163.666 George Cozma

It's now UALink.

0
💬 0

3164.326 - 3165.067 Adam Leventhal

Yeah, exactly.

0
💬 0

3165.087 - 3167.63 George Cozma

Which, ironically enough, Intel is now a part of.

0
💬 0

3169.804 - 3185.976 Adam Leventhal

Right. So yeah. So, so yeah, I think it's like, it's good that you have that consortium and you'll be able to push some stuff there. But I also feel like at the end of the day, you know, where you see a lot of value from Nvidia is that they are building, you know, where they've been successful because they have vertically integrated a whole lot of that stuff. Yes.

0
💬 0

3186.616 - 3206.458 Bryan Cantrill

Yeah. And, you know, so yeah, I mean, absolutely. It would be great for them to do that. Although that said, you know, they need to do it in the right way. And the right way from our perspective is really establishing a substrate that people can build an open ecosystem on top of. And this is something that, you know,

0
💬 0

3207.439 - 3232.698 Bryan Cantrill

And I always, I find vexing is you would think if you make hardware, it's enormously in your best interest to allow many, many software stacks to bloom, um, by having a well-documented committed interface. But, um, They really don't. It's a challenge, I would say. I wouldn't say they don't, it's too reductive. I think that they fight their own instincts on it.

0
💬 0

3234.278 - 3255.749 Bryan Cantrill

And so we're very excited with Xsight Labs. Again, you can see our announcement today, or their announcement today, actually, but it features Oxide for sure. And we definitely see eye to eye with them on their X2. We're looking forward to moving forward with that part. And we want to see programmable networking everywhere.

0
💬 0

3255.789 - 3290.1 Bryan Cantrill

We want to see this open ecosystem everywhere. On the note of the lowest levels of the platform that can be hard to get at: George, you may recall that we have no BIOS in our system. So there is no AGESA, there is no AMI BIOS. So when we... BIOS, BIOS, AMI? Nailed it. Oh, thank you. We have lots of bias. Up and at them, up and at them. Better.

0
💬 0

3290.781 - 3298.995 Bryan Cantrill

Do you wake up your kids that way, Adam, as Rainier Wolfcastle, saying up and at them?

0
💬 0

3299.864 - 3302.885 Aaron Hartwig

I'm not as good a parent as you in that regard.

0
💬 0

3304.646 - 3324.533 Bryan Cantrill

Feels very loaded. I think my kids have definitely gotten sick of that particular Simpsons reference. Rainier Wolfcastle, no longer welcome in our abode. But we have no BIOS. And so Robert, that lowest level platform enablement has fallen to us. What are some of the differences in Turin from, or even from Genoa, but then from Milan?

0
💬 0

3325.205 - 3345.67 Adam Leventhal

Yeah, I think actually we've been talking about PCIe a bit. So I actually think one of the things that I find has been both fun and sometimes a little vexing, but is ultimately good for the platform, not always as fun for us in how the register numbers sort themselves out, is that they've actually increased the number of IOMS entries in there.

0
💬 0

3345.71 - 3371.11 Adam Leventhal

So basically in the past where you had a group of 32 PCIe lanes, which are basically two X16 cores, They were consolidated into one connection to the data fabric. Actually, one of the more interesting things is that we've seen that in Turin, each X16 group is connected to the data fabric independently through its own kind of IOMS slash IOHC, which are all, I guess, internal data items.

0
💬 0

3371.35 - 3373.713 Bryan Cantrill

Yeah, those are effectively hidden cores, right?

0
💬 0

3373.793 - 3399.373 Adam Leventhal

Or those are core-ish? I mean... Yeah, I don't know how much. I'm sure there's a Z80 hidden in everything, or an 8051. So I'm sure everything's a core at the end of the day. But actually, if you just kind of look at it, this part is less hidden. Because if you just look at, hey, show me the PCI devices on Turin, you'll see, hey, there's eight AMD root complexes where there used to be four.
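
The change from four root complexes to eight falls out of the lane arithmetic described above: 128 lanes grouped 32 at a time per data fabric connection gives four, and grouping them 16 at a time gives eight. A tiny sketch of that arithmetic, with a hypothetical function name:

```python
TOTAL_PCIE_LANES = 128  # lane count quoted for the socket


def root_complexes(lanes_per_fabric_port: int) -> int:
    """Number of root complexes if each fabric connection serves this many lanes."""
    return TOTAL_PCIE_LANES // lanes_per_fabric_port


print("32 lanes per connection:", root_complexes(32))  # earlier parts
print("16 lanes per connection:", root_complexes(16))  # Turin, per the discussion
```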

0
💬 0

3399.513 - 3408.298 Bryan Cantrill

There used to be four. Yeah, interesting. Yeah. And that is presumably, so you are, you're just increasing the parallelism there. And I mean, is that the.

0
💬 0

3408.318 - 3419.365 Adam Leventhal

Yeah, that would be my theory is that basically it's getting you more because there's more data fabric ports that you can have just more transactions in flight to different groups of devices.

0
💬 0

3421.825 - 3435.194 Bryan Cantrill

And so but but those kind of changes, which if we were at some level of software, that's an implementation detail you don't need to see. But at the level of software we're at, like you actually need to go accommodate those differences.

0
💬 0

3436.635 - 3464.74 Adam Leventhal

Yeah, yeah, that's definitely it. Otherwise, it's Milan to Genoa was more. There are more changes than Genoa to Turin. Interesting. In kind of some of the lower level stuff. Some of these kind of bits like how do you do PCIe initialization, hot plug have stayed more the same. From Genoa. From kind of Genoa to Turin. Yeah. They have some different firmware blobs that you talk to. So like...

0
💬 0

3465.8 - 3486.347 Adam Leventhal

The SMU interfaces stay the same across these, but they moved to a new what they call MPIO framework, which is what goes and programs the DXIO crossbar. PCIe device training is kind of a collaborative effort between that core and X86 cores.

0
💬 0

3486.467 - 3490.228 Bryan Cantrill

And could you describe what device training is? Like, why does a link need to be trained? What is that?

0
💬 0

3490.248 - 3512.215 Adam Leventhal

Yeah. So there's two different pieces here. So first off, when AMD sells these, in all their marketing they say, hey, we've got 128 PCIe lanes, which is great. But the first thing you have to figure out is, well, actually, how do those work on the board? Are these x16 slots? Are they x4 slots that are actually connected to an SSD?

0
💬 0

3512.335 - 3537.169 Adam Leventhal

What's their size and width, and how do they actually fit across the board? So one of the first things that everyone has to do is tell AMD's firmware, hey, here's how this is actually connected, you know, these logical, these physical PHYs. I've got an x16 slot. In our case, we've got 10 x4 slots, basically one for every front-facing U.2.

0
💬 0

3537.189 - 3553.702 Adam Leventhal

Right. You know, an x16 slot for a NIC. You'll have other things for other folks. Or if you have a board like the one that showed up in the chat, you've got some number of x16 slots that map to things, and probably some M.2 slots. So you have to tell it what all is there.

0
💬 0

3554.102 - 3576.65 Adam Leventhal

So it can basically go and reprogram the internal crossbar to say, okay, these lanes should be PCIe. You know, George mentioned earlier that when you have a two-processor configuration, you know, some of those lanes are being used for that. So that's part of it. If you use SATA, which I, you know... getting less and less common.
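
The board description handed to the firmware is essentially a lane plan that must fit within the socket's 128 lanes. The sketch below checks the layout mentioned above (ten x4 front-facing drive slots plus one x16 NIC slot); the data structure and function names are hypothetical and do not reflect any real firmware interface.

```python
TOTAL_PCIE_LANES = 128

# name -> (lane width, number of slots); figures taken from the discussion.
lane_plan = {
    "front U.2 SSDs": (4, 10),
    "NIC": (16, 1),
}


def lanes_used(plan: dict) -> int:
    """Sum the lanes consumed by every slot in the plan."""
    return sum(width * count for width, count in plan.values())


def validate(plan: dict) -> int:
    """Return the number of spare lanes, or raise if the plan oversubscribes."""
    spare = TOTAL_PCIE_LANES - lanes_used(plan)
    if spare < 0:
        raise ValueError(f"plan needs {-spare} more lanes than the socket has")
    return spare


print("lanes used:", lanes_used(lane_plan))
print("lanes spare:", validate(lane_plan))
```

A real description also has to say which physical lanes each slot occupies and whether any are diverted to other uses such as the socket-to-socket links or SATA, which this sketch ignores.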

0
💬 0

3576.951 - 3578.091 Nathanael Huffman

Yeah, that's right.

0
💬 0

3578.191 - 3601.541 Adam Leventhal

You know, some of those lanes come from those same PCIe lanes. So that's the first step. Then the next phase, after you've done that, is, if you open up the PCIe base specification, in like chapter two is a very long state machine. It has a lot of different states and a lot of different phases.

0
💬 0

3601.561 - 3620.416 Adam Leventhal

Obviously, what does it mean to basically have a PCIe device end up at the other end and have both sides be able to talk? Um, and so device training is basically going through that process, discovering, is there even a device there? Right. And from there trying to say, okay, let's start talking to one another, figure out how we can talk, then what speed we can talk at.

0
💬 0

3620.876 - 3638.865 Adam Leventhal

Once we're good at, you know, a certain speed, then they'll increase, um, to additional speeds, sending these things called ordered sets and training sets and lots of different acronyms that you hope generally just work and you don't need to think about. And then unfortunately, sometimes you do need to think about them.
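
The training flow described here can be caricatured as a walk through a few of the link state machine's states: detect a partner, exchange training sets, configure the link, reach the operational state at the base rate, then pass through recovery to move to a higher common speed. This is a heavily simplified sketch; the state names are real LTSSM states, but the single-step speed change and the function itself are simplifying assumptions.

```python
BASE_SPEED_GT_S = 2.5  # links first come up at the Gen1 rate


def train_link(host_max_gt_s, device_max_gt_s, device_present=True):
    """Return (visited states, negotiated speed) for a simplified training run."""
    states = ["Detect"]
    if not device_present:
        # Nothing on the other end: the link never leaves Detect.
        return states, None

    states += ["Polling", "Configuration", "L0"]
    speed = BASE_SPEED_GT_S

    # Both ends advertise their maximum; the link can only run at the lower one.
    target = min(host_max_gt_s, device_max_gt_s)
    if target > speed:
        states += ["Recovery", "L0"]  # a speed change goes through Recovery
        speed = target
    return states, speed


print(train_link(32.0, 16.0))
print(train_link(32.0, 16.0, device_present=False))
```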

0
💬 0

3639.133 - 3658.184 Bryan Cantrill

Right, when they misbehave. So there's a lot of low-level work that we need to go do. And how do we, in terms of like, we don't yet have, I mean, Eric and Nathanael and crew are working on Cosmo as we speak, kind of finishing up Cosmo. How do we work on that before we have our own board in hand?

0
💬 0

3659.224 - 3676.379 Adam Leventhal

Yeah, so the main way we do this is that there are often reference platforms. So, you know, if you look at George's article, all of his testing was done on a Volcano platform, which is the name of a platform that AMD developed that was specific for Turin. There's a couple older generations.

0
💬 0

3676.399 - 3679.162 Bryan Cantrill

Is it too much to hope that someone has a sense of humor that named that thing Volcano?

0
💬 0

3680.523 - 3686.048 Robert Mustacchi

Honestly, it would not surprise me if somebody did have a sense of humor.

0
💬 0

3686.676 - 3693.718 Adam Leventhal

I mean, you had Volcano, you had Purico. I'm trying to remember what the other two were. The ones before that were all metals.

0
💬 0

3693.758 - 3704.742 Bryan Cantrill

You had, like, Onyx. Yeah, no sense of humor. I just like the idea that, like, Inferno, you're going with the... So, yeah, so we got the Volcano reference platform from AMD.

0
💬 0

3704.802 - 3728.777 Adam Leventhal

Yeah, so we are doing ours mostly on a bunch of Ruby platforms, which were the ones that first came out for Zen 4. And so we have those, which gives us generally most of the schematics and other bits there. You generally get most of the firmware, but not all of it. So you can't always do all the things on the board that you think you should, like a reference platform that you do.

0
💬 0

3729.917 - 3739.585 Adam Leventhal

But that gives us a development platform. So we're fortunate that we were able to get some early silicon from AMD, so we could actually start doing development of that ahead of launch. Right.

0
💬 0

3740.365 - 3747.818 Bryan Cantrill

And then Nathaniel, do you want to talk about kind of how we use those dev platforms? Because we've got a little, a great little board there.

0
💬 0

3749.454 - 3774.046 Eric Aasen

Yeah. So notably we, uh, we're able to take the BMC... So in Ruby, AMD made this BMC board called a Hawaii board that has an ASPEED BMC and kind of all your traditional BMC stuff, but it's on an OCP card. So we can pull that out. And so we developed our own OCP, uh, form factor card. And we, uh, we call that the Grapefruit.

0
💬 0

3774.76 - 3799.18 Eric Aasen

And that goes into that BMC slot and connects to the OCP connector, but it has our SP and RoT and, uh, a Xilinx FPGA there and some flash and, you know, kind of the basics, so I3C level translators, that kind of thing. So we can kind of hotwire into the Ruby dev platform with, you know, what is effectively our BMC topology and do some development there.

0
💬 0

3799.973 - 3823.951 Eric Aasen

And so that's kind of... I've been working on Grapefruit a lot the last few months, I guess, and doing some of the FPGA work and trying to get the thing integrated so that Hubris runs on it and we've got, you know, our Ethernet stack runs on it and all of that. And that's all going pretty smoothly. And we're getting close to being able to use our Grapefruit board as an eSPI boot target.

0
💬 0

3824.012 - 3826.377 Eric Aasen

So I don't know that we have talked a lot about eSPI boot.

0
💬 0

3826.397 - 3830.045 Bryan Cantrill

Yeah, we should talk about eSPI because this is definitely a difference in Turin.

0
💬 0

3831.414 - 3853.667 Eric Aasen

Yeah, so Turin supports... You can do the standard SPI NOR boot, like all of these devices have done for a while. But they added eSPI boot, which is based off an Intel standard. And so it's an extension to eSPI, which is kind of like an LPC replacement. But it's an extension to that that allows you to have...

0
💬 0

3854.635 - 3880.116 Eric Aasen

what they call slave-attached storage, or slave-attached file storage, or some flash storage, something like that. And that allows you to boot over the eSPI network, and it's basically built exactly for the server use case. So, you know, you have a BMC or some kind of device sitting there, and then the flash is hiding behind that. And so you talk eSPI to that device, and then it goes off and fetches the flash and does whatever it needs to do,

0
💬 0

3880.756 - 3898.528 Eric Aasen

you know, out of SPI NOR, or I guess you could, you know, do it off of NVMe or something if you'd like. And you just feed it the bytes that it requested back over the eSPI interface. And so that's how we're planning... So why did we need to enhance SPI to do that?

0
💬 0

3898.608 - 3899.729 Bryan Cantrill

Why can't we do that on SPI?

0
💬 0

3900.31 - 3920.844 Eric Aasen

Well, I mean, so you can talk SPI to a NOR flash, but that's basically all you can do. And the eSPI protocol kind of sits on top of what looks like a fairly standard SPI, like Quad SPI, interface. But it allows you to go request transactions, and then you just wait until the device goes and gets them. So, you know, flash is notoriously slow.

0
💬 0

3922.02 - 3942.436 Eric Aasen

And so if you are the only device talking to a, uh, like a Quad SPI flash and you ask it for something or you ask it to go do an erase or whatever, you basically just kind of have to hang out and spin your tires until it finishes and wants to give you that data back. And over the eSPI interface, they do posted and non-posted transactions.

0
💬 0

3942.456 - 3964.926 Eric Aasen

So you can do these non-posted transactions and say, hey, I want a kilobyte of flash from this address. And you send that message. And then you can continue on using eSPI, talking to the device, doing other things, while whatever the eSPI target is goes off and does the work to get you your kilobyte of flash. And then it'll let you know with an alert that it has data for you.

0
💬 0

3964.966 - 3966.387 Eric Aasen

And you can come by and fetch it when you want.
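The request, alert, fetch flow described here can be modeled in a few lines. This is only a toy model of the idea, assuming a single flash channel; the `FlashTarget` class and its method names are hypothetical and do not correspond to the eSPI wire protocol or to any real driver API.

```python
from collections import deque


class FlashTarget:
    """Toy model of an eSPI target (e.g. an FPGA or BMC) fronting a flash part."""

    def __init__(self, flash: bytes):
        self.flash = flash
        self.pending = deque()    # requests accepted but not yet serviced
        self.completed = deque()  # completions waiting for the host to fetch

    def request_read(self, tag, addr, length):
        # Non-posted request: the target accepts it and returns immediately.
        # No data comes back yet, so the host is free to do other things.
        self.pending.append((tag, addr, length))

    def do_background_work(self):
        # The target reads the (slow) flash on its own schedule.
        if self.pending:
            tag, addr, length = self.pending.popleft()
            self.completed.append((tag, self.flash[addr:addr + length]))

    @property
    def alert(self):
        # Raised whenever there is a completion ready for the host.
        return bool(self.completed)

    def fetch_completion(self):
        # The host comes by and collects the data when it wants to.
        return self.completed.popleft()
```

A host would call `request_read(tag=1, addr=0x100, length=1024)`, carry on with other traffic, and only call `fetch_completion()` once `alert` is set. With plain SPI there is no equivalent of that gap: the data has to be ready on the clock edge the host chooses.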

0
💬 0

3966.854 - 3993.168 Bryan Cantrill

That's right. And SPI, as my kids would say, SPI has no chill. SPI, you need to give it what it needs. You're set on the clock. There's no clock stretching in SPI. And so SPI interposition becomes a real nightmare because you need to get everything you need to get done, you need to get done in that one clock cycle. It's like, yeah, that's really hard.

0
💬 0

3993.409 - 4019.937 Eric Aasen

Right. And on Gimlet, what we did is we put actual analog SPI muxes in so we could flip between our A-boot and B-boot flash images. So we'd actually just swap between chips that way. With eSPI, none of these images are very large, so you end up buying a commodity flash part with, say, a gigabit of flash storage. And you only need 32 megabytes or something like that.

0
💬 0

4019.997 - 4040.142 Eric Aasen

You know, you need these small, like, PSP images there. So with eSPI, we can also go down to one flash part. And then the FPGA that's acting like the eSPI target will just translate, you know, into the high or low pages, basically, of the flash. But the AMD part doesn't really have to know the difference. So that makes things a little simpler on our end.
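The high/low translation mentioned here amounts to adding a slot offset to whatever address the host asks for. A minimal sketch, assuming the sizes quoted in the conversation (a one-gigabit part, roughly 32 megabytes needed per image) and an even split into two halves; the constant and function names are made up for illustration.

```python
FLASH_SIZE = 128 * 1024 * 1024  # a 1-gigabit part is 128 MiB
SLOT_SIZE = FLASH_SIZE // 2     # assumption: slot A in the low half, B in the high half
IMAGE_SIZE = 32 * 1024 * 1024   # "32 megabytes or something like that"


def translate(host_addr, active_slot):
    """Map an address as the host sees it onto the physical flash part.

    The host always asks for addresses starting at zero; only the
    target knows which half of the part is currently backing them.
    """
    if not 0 <= host_addr < IMAGE_SIZE:
        raise ValueError("address outside the boot image window")
    if active_slot not in (0, 1):
        raise ValueError("slot must be 0 (A) or 1 (B)")
    return active_slot * SLOT_SIZE + host_addr
```

Switching boot images then becomes a matter of changing `active_slot` in the target, rather than flipping an analog mux between two chips.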

0
💬 0

4040.885 - 4059.256 Adam Leventhal

Yeah, and the other nice bit there is that as you get into DDR5, one of the big problems is training time. Yes. So one of the things that AMD has is that you actually, after you train the first time, you actually end up writing back a bunch of this data into that SPI flash. Yeah.

0
💬 0

4060.237 - 4072.523 Adam Leventhal

And if you have it, you know, without that virtualization, then if you're trying to kind of hash or figure out like, you know, how do I make sure these contents are all what I thought I wrote down, it just gets gnarlier.

0
💬 0

4072.623 - 4082.127 Adam Leventhal

And this kind of indirection layer, you know, computer science's one contribution, adding another layer of indirection just to cheat, just comes in handy.

0
💬 0

4082.472 - 4108.654 Bryan Cantrill

Yeah, and just on the training times, without that, without kind of, and so DIMM training, where you're trying to search for the constants that are going to allow you to not have interference when you're talking to these DIMMs, that search can take a long time. And Robert, how long does it take you when you've got the first, that first Genoa that you've got-

0
💬 0

4111.02 - 4132.923 Adam Leventhal

Yeah, the first thing I had, which admittedly was early, A0 silicon. This is no shade on AMD. But I think it was one DIMM of DDR5, not that big. It definitely felt like minutes. It was 11 minutes, I believe, was the number. Was it really?

0
💬 0

4132.943 - 4142.029 Bryan Cantrill

Yeah, it was. I've put it out of my brain. Yeah, George, I don't know if you've seen some long boot times, but DDR5 takes a long time to train.

0
💬 0

4142.329 - 4153.776 George Cozma

So I know when I was first booting my 9950X system, I put in the DIMMs, I turned it on, I went, grabbed a cup of coffee, came back,

0
💬 0

4155.433 - 4177.911 Robert Mustacchi

noticed it was still booting, was like, okay, let me go feed my cats. Fed my cats, came back, was still booting. It's like, okay, let me go run to the bathroom. Come back, still booting. I'm like, is it actually booting? Like, what's taking it? And I was about to turn it off when I saw in the corner of my eye that my monitor flashed up. I'm like, okay, I'm finally booted. It's alive. It's...

0
💬 0

4178.666 - 4206.853 Robert Mustacchi

Yeah, I'm like, okay, thank God it's actually done. And I did screw something up because I had to update the BIOS, which is always a bit of a nerve-wracking experience when you have to do it over just watching a flashing light go. And you're like, is it done? Are you done? Did you work? I hope it worked. And then you turn it on and you're just like, okay, hopefully this works.

0
💬 0

4208.407 - 4210.889 Bryan Cantrill

Adam apparently really prefers your pronunciation, George, to mine.

0
💬 0

4210.909 - 4215.832 Aaron Hartwig

Yeah, that was delightful. Suffering through bias, right?

0
💬 0

4216.493 - 4224.458 Bryan Cantrill

All right, I'll do my Duolingo with George on there.

0
💬 0

4224.498 - 4232.604 Eric Aasen

I will say, one of the other major reasons for using eSPI was that it gains us back our second UART channel, which we lost in Turin.

0
💬 0

4233.404 - 4255.438 Eric Aasen

Uh, because we like hardware handshaking, and the second UART in Turin doesn't have hardware handshaking. And so, you know, we're going to plan to do our IPCC protocol between the SP and the, uh, the SP5, or the Turin processor, over eSPI as well. So that'll be a multiplexed path, and that was something that we had to solve regardless of eSPI boot.

0
💬 0

4256.739 - 4267.104 Adam Leventhal

As a former colleague of mine who retired was fond of saying, why do we have pins for Azalia on SP5, and they couldn't give me two pins to have a second flow-controlled UART?

0
💬 0

4268.104 - 4288.87 Bryan Cantrill

It was, yeah, we definitely, and I'm not sure, you know, maybe we're, I guess we're a bit unusual on this, but boy, we were unusual. We need the, yeah, that... So, George, we were using, just as Nathanael mentioned, we were using one of the UARTs as the IPCC, which is the interprocessor, right? Communication channel.

0
💬 0

4290.391 - 4301.393 Bryan Cantrill

But this is our protocol for the socket, the host OS, to be able to speak to the SP, which is our replacement for the BMC.

0
💬 0

4302.353 - 4324.527 Adam Leventhal

Um, and we were specifically looking for something that we could use that didn't require PCIe training or very much, uh, from the peripheral space, that kind of keeps us stuck in the FCH. Uh, even USB obviously requires a lot of SMU bring-up and shenanigans. So that was out of the picture. Um, so we ended up with the UART and, uh, actually the AMD UARTs can go up to three megabaud,

0
💬 0

4324.867 - 4350.726 Adam Leventhal

which is more than, you know... Well, okay, well, okay. So they actually, they can't go up to three megabaud by default. We actually, we needed a, uh... well, the RS-232 level translator, to go from like the 3.3 volt to, I don't know, minus 12, plus 12, that could not do three megabaud.

0
💬 0

4352.047 - 4373.786 Adam Leventhal

But the three megabaud ended up being very load-bearing for us because, well, when we were doing DIMM training and are doing DIMM margining, the PSP is spewing output. It was just happily going at 115,200. Yes. And there was no token to change it to anything. Yeah. That's for real.

0
💬 0

4373.806 - 4376.629 Adam Leventhal

It's not to get a three megabaud. So it was very, it's very slow.

0
💬 0

4377.329 - 4401.294 Bryan Cantrill

It was very slow. And our friends at AMD fortunately got us a fix to the PSP to operate at three megabaud. And that was very, it was life-changing for, I mean, I know 30x makes a big difference for RFK. 30x means, when it's 30 minutes, that 30x is like a real actionable human 30x. Not all 30x's are the same.

0
💬 0

4401.394 - 4405.195 Bryan Cantrill

And when something takes 30 minutes, taking 30 X off of that is a, is a big deal.
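For a rough sense of the numbers, the jump from 115,200 baud to three megabaud works out to about 26x, which is the "30x" being rounded to here. The sketch below assumes 8N1 framing (10 bits on the wire per byte) and that the job is entirely bound by the serial link; both are assumptions for illustration, not statements about the actual PSP output path.

```python
def uart_bytes_per_second(baud, bits_per_frame=10):
    """Payload bytes per second, assuming 8N1: start bit + 8 data bits + stop bit."""
    return baud / bits_per_frame


slow = uart_bytes_per_second(115_200)    # 11,520 bytes/s
fast = uart_bytes_per_second(3_000_000)  # 300,000 bytes/s
speedup = fast / slow                    # roughly 26x

# If a run was bottlenecked on serial output and took 30 minutes at
# 115,200 baud, the same output at three megabaud takes a bit over a minute.
seconds_before = 30 * 60
seconds_after = seconds_before / speedup
```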

0
💬 0

4406.295 - 4433.469 Eric Aasen

Uh, so the good news is with the eSPI, I think we can get significantly faster than three meg too, so it'll be interesting to see what that looks like in practice. Uh, eSPI is a little weird. It's simplex, so you can only transmit one direction, you know, at any one time. Uh, but you can do quad at 66 megahertz, so we should be able to get, uh, something a little bit faster than three meg, I think. And that is... yeah, sorry, George, go ahead.

0
💬 0

4434.572 - 4461.025 George Cozma

It's really funny that you guys are talking about like three megabaud and whatnot, because, so slight tangent. So I used to work, back when I was in college, I used to work at the on-campus observatory. And there was a data uplink from the observatory to the lab, which was about a mile and a half distance. It was still running 800 baud for some serial connection.

0
💬 0

4462.509 - 4464.71 Adam Leventhal

Oof, 800 baud.

0
💬 0

4465.17 - 4466.731 George Cozma

Yes, yes.

0
💬 0

4466.751 - 4468.572 Adam Leventhal

You could practically run a message up there faster.

0
💬 0

4470.013 - 4478.717 George Cozma

Yeah, but mind you, this was just a, all it was was basically just the go signal to start the power up for everything.

0
💬 0

4479.017 - 4481.458 Bryan Cantrill

Yeah, the go signal still takes several minutes to transmit.

0
💬 0

4482.599 - 4496.424 George Cozma

Yeah, so basically you would send the message and then you would either walk or drive up. And by the time you got there, it was done. But it was like, and then the way that you had to connect to all of it was through a BBS.

0
💬 0

4497.884 - 4503.026 Bryan Cantrill

I'm not even joking. This sounds like a dream that I would have that I would describe to Adam.

0
💬 0

4503.106 - 4506.127 Robert Mustacchi

The system was built in like the 1980s. It was not updated until 2020. Yeah. Wow. Yeah.

0
💬 0

4514.25 - 4525.158 Bryan Cantrill

I'm sure. I would like to believe that original designers and Oxide and Friends listeners would be like, oh God, that was still in use? That was supposed to be for a weekend. That was not supposed to be. That was a temporary fix. Totally.

0
💬 0

4525.178 - 4529.622 Robert Mustacchi

Oh no, this was no temporary fix. It was designed like that. The... So...

0
💬 0

4531.936 - 4556.565 Bryan Cantrill

And so Nathanael, maybe worth elaborating a little bit why the three megabaud is so actionable for us beyond just the margining and the MBIST results, because this actually ends up becoming, because this is our conduit for the SP to talk to the host CPU, we use this in the recovery path. So like if you've got a system that can't talk to anything else, it's gonna load its image

0
💬 0

4557.225 - 4582.524 Eric Aasen

via that link, and being able to go faster than three megabaud is going to be really, really nice, right? Yeah, yeah. I mean, that's kind of the big thing. I think the big hope here is that for Cosmo, when we do recovery, we could potentially use the, you know, like, I don't know, I'm hoping to get it, you know, somewhere up in the 12 megabaud, but it's going to depend on, you know, how busy we are doing other things on that link too, because it's a shared resource. So...

0
💬 0

4583.704 - 4598.575 Aaron Hartwig

When we talk about recovery, think about like DFUing your phone or whatever. We use this during the manufacturing process. So if a server has kind of gone out to lunch in some way or we just want to wipe it clean, we're using this mechanism and going at 3 megabaud.

0
💬 0

4599.116 - 4605.341 Eric Aasen

Yeah, and I think we're replacing the SPI NOR image basically over 3 megabaud. So it's, you know, slow. Yeah.

0
💬 0

4605.641 - 4610.486 Bryan Cantrill

It takes a minute, as the kids say, but it actually takes like, we're actually doing two different things, Nathanael.

0
💬 0

4610.546 - 4614.59 Adam Leventhal

So we first are writing the SPI NOR, which actually goes much faster.

0
💬 0

4614.77 - 4615.031 Bryan Cantrill

Yes.

0
💬 0

4615.071 - 4636.871 Adam Leventhal

That part is quick. But then we basically, instead of sending the full M.2 image that we would boot from, which would be like a gig and basically be, you know, an eternity in that world, we have a slimmed-down, basically kind of phase two image. So unlike a traditional BIOS where you're basically splitting up, you know, the BIOS is in your SPI flash.

0
💬 0

4636.931 - 4657.988 Adam Leventhal

It sits there and then kind of goes and pretends to reset the world back into 1970 after waking up and changing everything and, you know, turning on all the CPUs so it can turn them off again. Um, you know, we basically have a continuous operating system image. So basically we just kind of say, hey, you'll find half your, like, RAM disk, half your modules, somewhere else.

0
💬 0

4658.248 - 4669.637 Adam Leventhal

So, um, when we end up doing the recovery, we end up sending kind of a, like, slimmed-down, just, you know, a measly hundred megabytes over this small link. Um,
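To put "a measly hundred megabytes over this small link" in perspective, here is a back-of-the-envelope estimate. It assumes 8N1 UART framing (10 bits on the wire per byte), takes 100 MB as exactly 100,000,000 bytes, and ignores protocol overhead and link sharing, so real transfers would be slower. The 12 megabaud figure is the hoped-for rate mentioned earlier, not a measured one.

```python
def transfer_seconds(size_bytes, baud, bits_per_byte_on_wire=10):
    """Idealized time to push size_bytes over a serial link at the given baud rate."""
    return size_bytes * bits_per_byte_on_wire / baud


RECOVERY_IMAGE_BYTES = 100_000_000  # "a measly hundred megabytes"

at_3_megabaud = transfer_seconds(RECOVERY_IMAGE_BYTES, 3_000_000)    # about 5.5 minutes
at_12_megabaud = transfer_seconds(RECOVERY_IMAGE_BYTES, 12_000_000)  # under 1.5 minutes

# For comparison, the full image of "like a gig" at 3 megabaud is close to an hour.
full_image = transfer_seconds(1_000_000_000, 3_000_000)
```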

0
💬 0

4670.337 - 4698.521 Bryan Cantrill

And actually, and so George, in all honesty, like part of the rationale for this is to get us out of those moments of terror when you are flashing a BIOS and you have often got no recourse if that goes sideways. And so this gets us out of that because we know that, at the absolute lowest layers of the system, we can get the system to be able to boot. And we,

0
💬 0

4699.822 - 4715.255 Bryan Cantrill

it gives us much more control over the reliability of the system, upgradability of the system, manageability of the system. That's how we're able to get... an Oxide rack can arrive, power on, and get going and provision VMs in minutes instead of days, months, weeks, whatever.

0
💬 0

4715.275 - 4730.771 George Cozma

Speaking of sort of... Again, sort of a question to you guys, because this is stuff that I... I know a lot more about CPUs and GPUs than I do sort of the networking and the sort of lower level intricacies of all this.

0
💬 0

4731.311 - 4749.097 Bryan Cantrill

Well, I feel like this is like, we're like the sewer people and you, you know, like you get to, I mean, you've got this glorious palace in terms of the cores that have been built. And meanwhile, it's like the sewer people are happy about not being at three megabaud. Like, what's going on? No, it's a big deal down here in the sewer.

0
💬 0

4750.077 - 4769.767 George Cozma

Yeah. But sort of... So I asked you this back when I was in San Francisco meeting you guys in person. What do you think of sort of the updates to openSIL and how that's been going to get rid of AGESA?

0
💬 0

4769.787 - 4792.396 Bryan Cantrill

Yeah, so we are all in favor. So we have been, and actually it was funny because I actually first heard Turin, the code name Turin, when it was accidentally blurted out on one of those openSIL calls. I'm like, okay, what is Turin? And I remember asking Robert, like, that's a city in Italy, so it must be the next thing, but we hadn't heard of it yet.

0
💬 0

4793.797 - 4817.588 Bryan Cantrill

And openSIL was going to intersect with Turin, which of course, when we were first hearing that, it's like, oh my God, that just feels like Buck Rogers. It's like in the year 2041. But of course Turin is now here. And that work we are very, very supportive of. We are not actually using any of that because it's a different model.

0
💬 0

4817.708 - 4838.458 Bryan Cantrill

It's kind of going to, it's still a traditional model of a bootloader that's going to effectively make the system look like it's gone backwards or send the system backwards to boot a host operating system. And we've got this staged approach where we are running a single operating system the entire time.

0
💬 0

4839.418 - 4857.963 Bryan Cantrill

So it doesn't fit our model, but we're extremely supportive of it because we believe that we want these lowest levels of the system to be completely documented. And we want there to be room for many different approaches. And so I think that we're very supportive of openSIL in that regard.

0
💬 0

4861.181 - 4889.338 George Cozma

Yeah, because I know when we last talked about openSIL, it was very much in sort of the initial stage of it being ramped up and what was happening with it. It does seem like AMD is adopting more open standards with regards to sort of, because they also announced Caliptra, which is open source root of trust stuff.

0
💬 0

4890.179 - 4915.232 Adam Leventhal

Yep. Yeah, so we're excited to see how all that kind of starts to change. I mean, there's a recent... I dropped a link in there. I think they did this at OSFC. They talked about how they're going to have openSIL kind of be the mainstay more so for Venice. Yes. And so I think, you know, from our perspective, this is all good. It kind of gets us out there. We can start to point to things that...

0
💬 0

4915.892 - 4934.106 Adam Leventhal

you know, are in openSIL, and it ends up being a win for, uh, for everyone. So I think it's basically, we're excited to see it. Parts of it, um, you know, we may be able to leverage directly, but if not, you know, we can be inspired by it. They can be inspired by us and vice versa.

0
💬 0

4934.606 - 4959.557 Bryan Cantrill

Yeah, and it's very nice to be able to go compare notes, especially when things aren't working. Helpful to have multiple implementations out there. And I also think that the model for AGESA, the programming model, makes it very difficult to reason about the overall system. This is where Robert's eyes are going to start to twitch because...

0
💬 0

4961.818 - 4968.083 Bryan Cantrill

Robert spent a lot of time in the absence of documentation having to really understand what this code was doing.

0
💬 0

4968.103 - 4984.319 Adam Leventhal

I mean, I still think the best bit for me is in SP3, where the SMU, to do hot plug, it speaks over I2C. And I'm pretty sure the SMU itself does not reset the I2C peripheral to run at a hundred kilohertz.

0
💬 0

4984.92 - 5002.141 Adam Leventhal

When the I2C peripheral restarts, it starts in basically Fast-mode Plus, which is this weird push-pull mode at like one megahertz, which basically means it don't work. Yeah. And there was definitely no explicit initialization.

0
💬 0

5002.161 - 5015.577 Adam Leventhal

It's just that, hey, this DXE module, you know, depended on this DXE, which probably just did a generic blanket I2C initialization, which, you know, reset everything to 100 kilohertz.

0
💬 0

5015.797 - 5034.746 Bryan Cantrill

Well, and that's it. I mean, when you have these kind of, this is why it's so healthy to have different software ecosystems on the same hardware, because you don't want things to be working by accident. You want them to be, and it's, you want things to be well-documented and with well-committed abstractions. And failing that, it's good to have the software out there.

0
💬 0

5034.826 - 5041.589 Bryan Cantrill

So it's, no, it's been, George, it's been good. And we're, I think, excited to continue to see that.

0
💬 0

5042.834 - 5067.198 Bryan Cantrill

Um, some of the other, like, lowest-level differences on Turin: um, you mentioned the DIMMs per channel, and we kind of had a, um, a fork in the road in front of us in terms of two DIMMs per channel, 2DPC, versus 1DPC. And there's a trade-off there to be made. And Robert, what was the, I mean...

0
💬 0

5068.092 - 5091.572 Adam Leventhal

Yeah, so the big... So I think to help understand making the trade-off for DDR5, you kind of have to go back to DDR4. So when you have two DIMMs per channel, the way it works is that kind of in the channel, or if you go back even further in time, you'll actually find platforms with three DIMMs per channel, you basically are daisy-chaining the channel.

0
💬 0

5091.612 - 5116.044 Adam Leventhal

So the traces will literally go up to the first DIMM, then continue on to the second DIMM, or to the third DIMM in those platforms. So just the presence of that second, of having two DIMMs on there, sometimes changes the SI. In DDR4, it often didn't. So if you only had one DIMM populated, you could still get the maximum memory speed possible.

0
💬 0

5116.104 - 5131.011 Adam Leventhal

However, in DDR5, just the presence of two DIMMs per channel drops dramatically what maximum speed you can hit. And then if you actually make the mistake of populating it, then that drops the speed.

0
💬 0

5131.051 - 5159.086 George Cozma

Well, not just populating it, the fact of having that channel, right? So for Turin, it's 6,000, up to 6,400 with validation, but 6,000 with 1DPC. 4,400 with 2DPC, and then 5,200 if you're running 1DPC in a 2DPC board. So the fact of just having that second slot, you're losing a whole bunch of your memory clocks.

0
💬 0

5160.154 - 5166.117 Adam Leventhal

Yeah, then for us, the other big change is that from SP3 to SP5, you went from 8 channels to 12 channels.

0
💬 0

5166.437 - 5166.677 George Cozma

Yeah.

0
💬 0

5167.037 - 5178.362 Adam Leventhal

And so just for us, since we kind of have this half-width system, that's, I guess that's, what about, I don't know, one of the other, I don't know, Eric, do you remember what the width is on that?

0
💬 0

5178.382 - 5183.064

PCB is 10 inches wide. Yeah, 300 millimeters. A PCB is, yeah, 10 inches, so 300 millimeters-ish. Yeah.

0
💬 0

5186.005 - 5211.892 Adam Leventhal

So basically, we were in a place where you could fit 16 DIMM slots, but you weren't going to make 24 DIMM slots magically appear in the space of 16, not unless you got very creative. So we ended up saying, okay, between that and the fact that you now had 96 gig and 128 gig RDIMMs without going to 3DS, which means you can actually purchase them and pay for them without a lot of blood money, then...

0
💬 0

5213.212 - 5235.356 Adam Leventhal

or really you're not basically fighting against the GPUs and HBM, which means you can actually get them. Then that kind of, kind of put us down to, okay, well that, that, you know, you want the memory bandwidth. That's definitely one of the big values here. And the memory latency for a number of applications can definitely matter and you can still get to capacity in other ways.

0
💬 0

5235.616 - 5249.844 Adam Leventhal

So that's kind of, we ended up kind of going with a 12-channel, 1DPC kind of configuration. Because we looked at saying, okay, was that better? The other option was, hey, eight-channel, 2DPC. And that just kind of seemed kind of the worst of all worlds.
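The bandwidth side of this trade-off is easy to put numbers on. The sketch below computes theoretical peak bandwidth per socket as channels times 8 bytes per transfer times the transfer rate, using the speeds George quotes for Turin. Two assumptions to flag: a 64-bit data path per DDR5 channel (ECC bits excluded), and that an eight-channel 2DPC layout would run at the same 4,400 MT/s quoted for 2DPC. Sustained bandwidth in practice is lower than these peaks.

```python
BYTES_PER_TRANSFER = 8  # 64 data bits per DDR5 channel, ECC excluded


def peak_gb_per_s(channels, mt_per_s):
    """Theoretical peak memory bandwidth in GB/s (decimal gigabytes)."""
    return channels * BYTES_PER_TRANSFER * mt_per_s / 1000


configs = {
    "12 channel, 1DPC board": peak_gb_per_s(12, 6000),
    "12 channel, 2DPC board, one DIMM populated": peak_gb_per_s(12, 5200),
    "12 channel, 2DPC board, both populated": peak_gb_per_s(12, 4400),
    "8 channel, 2DPC (assumed 4400 MT/s)": peak_gb_per_s(8, 4400),
}

for name, gbps in configs.items():
    print(f"{name}: {gbps:.1f} GB/s")
```

On these assumptions the eight-channel 2DPC option gives up roughly half the peak bandwidth of 12-channel 1DPC while offering the same 16 DIMM slots, which is consistent with it being described as the worst of all worlds.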

0
💬 0

5250.224 - 5284.615 George Cozma

Yeah, I think that the 12 channel, one DPC move is probably the right move. I do like that AMD is giving the option for a two DPC setup with all 12 channels. But I could definitely see how people would really want Especially if in the future we go to, say, 16 memory channels. There's no way you're doing 2 DPC on that. Right. Yeah. Right? We're going to have to go to 1 DPC.

0
💬 0

5285.155 - 5305.41 George Cozma

Now, stuff like MR DIMMs can help with capacity and bring back that sort of two-DIMM capacity, or the capacity that two DIMMs per channel will get you. But... yeah, I think with 2DPC, the writing has been on the wall for it for a long time now.

0
💬 0

5306.212 - 5322.578 Bryan Cantrill

And I think just in general, when we had a trade-off where we'd have to give up memory latency, we have always felt that memory latency is really important. You want to minimize memory latency, and you don't want to take a hit there.

0
💬 0

5322.999 - 5327.64 Adam Leventhal

Well, in the DDR4 world, where you went from 3200 to 2933, that was a very easy cost to pay. If you were telling me to go from 6400 to 6000...

0
💬 0

5333.476 - 5362.515 Bryan Cantrill

probably could make that, you could probably convince yourself that that is actually worthwhile. But, you know, 5200, 4400, that's a lot farther from 6000. It's a big, yeah, right, it's a big, big chunk to take out. And in terms of MR DIMMs, because this is a domain where, you know, Intel is still, uh, basically the, I mean, because we're pretty standard on MR DIMMs, right? Let me get on my soapbox for about 30...

0
💬 0

5364.664 - 5378.11 George Cozma

Because Intel's MRDIMMs on Granite Rapids are not the JEDEC MRDIMMs. They are different. It's essentially just MCR DIMMs relabeled, which made me tear my hair out because they're not technically compatible standards.

0
💬 0

5378.651 - 5378.871 Bryan Cantrill

Yes.

0
💬 0

5380.072 - 5395.823 George Cozma

So I wanted to scream and shout and let it all out, as the famous song goes, because that was utterly infuriating to me, because you're saying you have MR DIMMs, but they're not really MR DIMMs per the JEDEC spec.

0
💬 0

5396.663 - 5417.79 Bryan Cantrill

So, you know, George, what I love about Oxide and Friends is when it comes to the soapbox of DIMMs not being per JEDEC standard, there's actually a line for the soapbox here at Oxide and Friends. Because this is a soapbox, Robert, you know, the soapbox you've been on.

0
💬 0

5419.951 - 5426.193 Bryan Cantrill

And it's, yeah, it's frustrating, but MRDIMMs, I think, when they are JEDEC standard, right, because they will be.

0
💬 0

5426.213 - 5446.121 Adam Leventhal

Yeah, the MR part of it is there. I think you'll, then once the questions, you know, as that slowly enters the market and memory controller support and, you know, seeing the costs, you know, assuming you can get the cost not to be ridiculous because volume is definitely one of the big parts of the DRAM business.

0
💬 0

5446.801 - 5469.297 Adam Leventhal

But, you know, for us, because we have a platform that's not trying to scrunch everything into a 1U, you know, a higher DIMM just means a new thermoformed airflow shroud. Right. And that's pretty easy to go fit in. You know, for us, the added height is not a problem. For other platforms and other chassis, you could be kind of SOL.

0
💬 0

5469.537 - 5478.524 Bryan Cantrill

Yeah, well, I think that, you know, one of our kind of big revelations, and again, this is not due to us, I think the hyperscalers have done this as well, but that the...

0
💬 0

5479.364 - 5504.909 Bryan Cantrill

the way to have maximal density is not necessarily to have maximal physical density. You want to open up some room for airflow, and you can actually get higher density by using a little bit more space. You know, the rack is nine feet tall, and so we're trying to use some of that space to get higher density.

0
💬 0

5508.218 - 5521.331 Bryan Cantrill

Um, the, uh, actually, can you just talk about backdrilled vias for a second, Eric, just because you mentioned it in the chat? I don't know, George, do you know about backdrilled vias? This is truly amazing stuff.

0
💬 0

5521.351 - 5522.192 George Cozma

I don't.

0
💬 0

5522.692 - 5523.673

Yeah. Oh, yeah.

0
💬 0

5523.713 - 5523.913 Bryan Cantrill

Yeah.

0
💬 0

5526.501 - 5546.229

I'm sure somebody will put a link to what the definition is in the chat as well. But basically, whenever you design a circuit board, a circuit board is essentially a set of two-dimensional layers that are interconnected in a third dimension. So it's kind of like 2.5D. And so to interconnect between these layers, you have what are called vias.

0
💬 0

5546.969 - 5565.99

And these vias are, in their most fundamental form, just a tiny hole drilled in the board that's plated with copper that connects multiple layers together. Unfortunately, when you get server motherboards and these bigger, higher density things, you have to use more layers and they get thicker.

0
💬 0

5566.37 - 5586.815

And it turns out that when you run high enough speeds, the length of that via from the top of the board to the bottom matters. And so if you have a signal from the very top of the board going through a via to the very bottom of the board, you can make that via look kind of like a trace, like a wire to the signal, and it won't really notice it.

0
💬 0

5588.336 - 5617.891

However, if you go, like, from the top layer to one routing layer down, which is like layer three, skipping a ground layer, going from one to three, great, fine. But then you have this via barrel, this wire, essentially, that's hanging off this trace that goes from layer three all the way down to the bottom. And that piece of wire looks a whole lot like a capacitor, if I remember my RF right.

0
💬 0

5618.732 - 5645.432

And so it basically creates a stub that causes a resonance. And when it resonates, it sucks all the energy out of your signal. And it turns out that things like DDR5 have high enough frequencies in them to require backdrilling. And to give you an idea of when you need to do backdrilling, I've designed boards that run like 10 gigabit lanes, so 10 gig on a single lane.

0
💬 0

5646.432 - 5672.623

That's a normal, you know, two-millimeter-thick kind of thing, and it didn't need backdrilling on it. And two millimeters thick is fairly standard for a server motherboard. 1.6 is like a commodity PCB. When you run like 28 gig, you have to backdrill it. And certainly when you run higher than 28 gig, you have to backdrill. So PCIe is 32 gig in Gen 5 now, so that has to be backdrilled.
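
[Editor's sketch, not from the speakers.] The stub resonance described above can be estimated to first order with the quarter-wave formula f = c / (4 · L · sqrt(Dk)). The stub length (1.8 mm left over on a 2 mm board) and the dielectric constant (4.0) are assumed illustrative values, and the function names are hypothetical. This simple model ignores the via barrel's extra capacitance, which tends to pull the real resonance lower, and it does not capture the single-ended DDR case.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def stub_resonance_hz(stub_length_m: float, dk: float) -> float:
    """Quarter-wave resonance of an open via stub: f = c / (4 * L * sqrt(Dk))."""
    return C / (4.0 * stub_length_m * math.sqrt(dk))


def nyquist_hz(bit_rate_bps: float) -> float:
    """Fundamental (Nyquist) frequency of an NRZ signal: half the bit rate."""
    return bit_rate_bps / 2.0


# Assumed values for illustration only.
STUB_LENGTH_M = 1.8e-3  # signal exits near the top of a 2 mm board
DK = 4.0                # rough dielectric constant for a PCB laminate

f_res = stub_resonance_hz(STUB_LENGTH_M, DK)
print(f"stub resonance: {f_res / 1e9:.1f} GHz")

for name, rate in [("10G lane", 10e9), ("28G lane", 28e9), ("PCIe Gen 5", 32e9)]:
    ratio = f_res / nyquist_hz(rate)
    print(f"{name}: Nyquist {nyquist_hz(rate) / 1e9:.0f} GHz, "
          f"resonance is {ratio:.1f}x Nyquist")
```

Under these assumptions the resonance lands near 21 GHz: several times the Nyquist frequency of a 10 gig lane, but uncomfortably close to that of a 28 or 32 gig lane, which is roughly in line with the rule of thumb given here. Shortening the stub by backdrilling pushes the resonance up and out of the way.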

0
💬 0

5673.464 - 5697.434

But what's crazy is even DDR running at six gig per lane, 12, double data rate, whatever. The frequency isn't that high, but the frequency content is high enough, and because it's single-ended, those vias start adding up, especially when you're routing in the top layers. And so we now have to backdrill a whole ton more vias than we used to.

0
💬 0

5698.405 - 5717.772

It used to be just those super high speed lanes like PCIe and, you know, 100 gig Ethernet and stuff like that had to be backdrilled. But now you're doing like thousands of these things. And it turns out backdrilling a via is really hard. So what they do is they take and remove that stub that's left over by shoving a bigger drill bit in from the other side.

0
💬 0

5719.471 - 5738.808

So they literally drill it out twice, once from the top, plate it, and then put it back in the drill, and then drill it again from the bottom. And what's absolutely crazy about this is getting those two drill holes aligned to within like a thousandth of an inch or 25 microns. And they have to do that because otherwise they'll short things out on the rest of our board.

0
💬 0

5739.509 - 5742.272

And that takes a fabricator with very high skill.

0
💬 0

5743.252 - 5766.492 Bryan Cantrill

I just love the fact that we're going to take a drill to the underside of the board. It does feel like an Adam Leventhal, PCB engineer kind of approach to this, of like, here, hold on, pass me the drill. I'll take this. Exactly. I need a drill and a running start, and we'll take care of it. Yeah, I'll fix your signal integrity issue with my drill. I'll fix it real good with this here drill.

0
💬 0

5768.013 - 5782.783 Bryan Cantrill

But it is a total precision, and it is 25 microns. Just amazing, Eric, that this is. And then we've got simulation tools. I mean, how do we kind of figure out where this needs to be done? And this is…

0
💬 0

5784.389 - 5809.299

Yeah, so we use both ANSYS and ADS. ANSYS is our full 3D, full-wave solver, and then ADS is used for most everything else. But basically, we take the board geometry or even just a theoretical via and put it into something like ANSYS HFSS, and we can simulate what the effect of that via design will be on our overall channel.

0
💬 0

5810.54 - 5831.951

And you can do it by just basically creating a fake channel in ADS and then putting in your extracted performance of your vias into that tool. And it'll tell you versus a perfectly, you know, a nice perfect transmission line, how good it is. And your goal is to get that via structure as perfect as a transmission line. So essentially the signal doesn't see it.

0
💬 0

5832.011 - 5834.913

It doesn't notice a difference when it goes through a via.
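
[Editor's sketch, not from the speakers.] One simple way to put a number on "the signal doesn't see it" is the reflection coefficient at an impedance step, Γ = (Z − Z0) / (Z + Z0), and the corresponding return loss. This is a textbook lumped approximation, not what a full-wave solver like HFSS computes, and the 50 ohm line and the via impedances below are assumed values for illustration. The function names are hypothetical.

```python
import math


def reflection_coefficient(z_via: float, z0: float) -> float:
    """Reflection at a step from a line of impedance z0 into impedance z_via."""
    return (z_via - z0) / (z_via + z0)


def return_loss_db(z_via: float, z0: float) -> float:
    """Return loss in dB; larger means less energy reflected."""
    gamma = abs(reflection_coefficient(z_via, z0))
    if gamma == 0.0:
        return math.inf  # perfectly matched: nothing reflects
    return -20.0 * math.log10(gamma)


Z0 = 50.0  # assumed trace impedance in ohms

# Assumed via impedances: a capacitive via dips below the line impedance.
for z_via in (40.0, 45.0, 48.0, 50.0):
    print(f"via {z_via:.0f} ohm: gamma = {reflection_coefficient(z_via, Z0):+.3f}, "
          f"return loss = {return_loss_db(z_via, Z0):.1f} dB")
```

The closer the via's impedance gets to that of the trace, the smaller the reflection, which is the sense in which a well-tuned via starts to look like a continuation of the transmission line.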

0
💬 0

5836.585 - 5861.184

And yeah, that takes a lot of time and a lot of simulations. Well, I was just gonna... Tweaking vias around, like, thousandths of an inch here and there. And I was gonna ask, like, these F-series parts may be actually relevant for you, Eric, you and Tom. Yes. Yeah, exactly. On the ANSYS side of things, it's a per-core license, and I think we have like a license for like 10 cores or something. So on my personal machine, when I run ANSYS, I got the, uh...

0
💬 0

5862.345 - 5872.213

I can't remember the part number offhand, but it's the 12-core variant of the AM5, the Zen, I think it's the Zen 4.

0
💬 0

5872.253 - 5875.496 George Cozma

So the 7900.

0
💬 0

5875.576 - 5892.441

Yeah, the 7900. So it's not the 3D cache one, but it's just the normal one. But that one will boost up to over 5 gigs. So idling right now, I'm at 5.2 gig. And that turns out to be really helpful when you're running... When you're running these simulations in ANSYS that are insanely single-threaded.

0
💬 0

5892.901 - 5908.877 George Cozma

The 9175F is a 16-core Turin part, up to 5 gigahertz. But it's 16 CCDs with one core per CCD. It's designed for EDA. It is. Absolutely.

0
💬 0

5909.618 - 5916.222 Bryan Cantrill

And I just love the fact that you got to think of like, you know, who wants that? Like that thing in the SKU stack. It's like, oh, the engineer that actually like.

0
💬 0

5916.823 - 5920.585 Robert Mustacchi

Exactly. It's all the EDA folks. It's all the EDA folks who are like.

0
💬 0

5920.625 - 5921.806

Put this thing under his desk.

0
💬 0

5922.307 - 5928.251 Bryan Cantrill

Exactly. So we got to get that SKU for Eric and Tom and the other folks that are running these simulations.

0
💬 0

5928.711 - 5937.033

I got one of those monster SKUs sitting around, one of the 500-watters. I'm like, yeah, that's cool. It'll pull a lot of power. I want the five gig one, man.

0
💬 0

5937.913 - 5961.578 Bryan Cantrill

Yeah, we got to get that for you. Not a cheap part, by the way, but still. No. We had the ANSYS folks on. I remember we had the ANSYS folks along with Tom on, talking about our use of simulation, which was another great episode. I really enjoyed talking to those folks. And you just learn about the physicality of the stuff. It just...

0
💬 0

5962.358 - 5982.464 Bryan Cantrill

blows me away. And, like, I feel, I mean, don't you feel bad that we end up running, like, our dumb software on top of this stuff at the end of the day? I just feel like we're kind of... Yeah, seriously. Well, I've been taking all these gigahertz for granted too, and just, like, the level of complexity underpinning this is bananas. It's like, we backdrilled for this thing to run PHP? It's a good look. Yes, sorry, yes.

0
💬 0

5985.265 - 5987.968

Well, I can serve up cat videos on YouTube faster.

0
💬 0

5988.008 - 6009.268 Bryan Cantrill

Totally, totally. But it's just amazing. And this part is a great part. You know, I think that we're, you know, I think, George, we were really excited to see, I mean, obviously your in-depth review was terrific. But I mean, I think, George, from your perspective, like this is a part that has really hit the mark in kind of like every dimension it feels like.

0
💬 0

6009.932 - 6030.8 George Cozma

Yeah, so the 9175F, I think for any EDA workloads, is sort of the Turin part for that. Then you have the 9575F, which to me feels like the drop-in replacement for all the OEMs for Genoa. You just take out all your Genoa chips and you put that in, and it's just...

0
💬 0

6032.109 - 6065.7 George Cozma

You get better ST, so single-thread and sort of low-thread-count workload performance, than the 9654, which is the top-end Genoa SKU, but basically just as good multi-thread performance at similar power numbers. So to me, that feels like the drop-in replacement. And then the big boys, the 9965 and the 9755.

0
💬 0

6066.161 - 6082.662 George Cozma

Those are the top-end, big-performance parts that the hyperscalers and all the... people who can use that power will grab. Right.

0
💬 0

6082.682 - 6101.854 Bryan Cantrill

And I think that, you know, another thing that we're looking at is partnering with Murata for those folks that actually do want to go more than 15 kW for the rack, which was our original design target, and which felt very aggressive in 2019. But then, you know, I think it feels like Nvidia is like, that's like two GPUs now for you. Yeah.

0
💬 0

6102.455 - 6107.318 Bryan Cantrill

I can get that in for you. Exactly. I can get that to you.

0
💬 0

6107.338 - 6107.538 Robert Mustacchi

Yeah.

0
💬 0

6109.384 - 6131.686 Bryan Cantrill

For folks that can go up above 15 kW for the rack, we'll be able to go do that. And again, it won't necessarily be quiet. Um, but we think we're going to be able to air cool that. Um, and that's where you get that kind of 7x consolidation that AMD was talking about. Um, and I think that there's, yeah. Yeah.

0
💬 0

6132.654 - 6162.363 George Cozma

Speaking of GPUs, something that wasn't covered in the media, even by us, was that when AMD gave their Turin presentation to the media, while AI was a big part, they didn't just... When asked about HPC and FP64, they're like, yeah, we're absolutely supporting that. Do not worry. And that was sort of a big relief off my shoulders, because it was like, thank God you're not just talking about AI.

0
💬 0

6163.043 - 6176.81 George Cozma

Like, there's HPC going on here. There's more than just, like, low-precision data types. There's FP64 things happening, thankfully.

0
💬 0

6178.584 - 6203.945 Bryan Cantrill

Yeah, and I think that we're excited for the AI workloads too, and I think they're going to get a nice pop from AVX-512, certainly the 512-bit data path there, and you're going to see a lot of those. There are nice pops to be had, we think. But you're right, it was not just AI. There are, as it turns out, other... We also need the workloads to simulate the computer for the AI, as it turns out.

0
💬 0

6204.065 - 6204.726 Bryan Cantrill

Eric needs the...

0
💬 0

6207.168 - 6228.844 George Cozma

So, so yeah, the, the 9575F was, was targeted towards sort of the head node for AI CPU that, that was its, what AMD was targeting it as. But I honestly think that in a general compute sense, it's, it's sort of the all rounder in my opinion. Yeah. Um, so.

0
💬 0

6230.475 - 6250.014 Bryan Cantrill

Yeah, we think so too. And I think that, you know, I think unlike with our first gen where it was kind of every, we only had one SKU, the 7713P, we're going to allow for some flexibility for Oxide customers inside that SKU stack. We're excited to kind of extend that and then do some of the work around dynamic power control.

0
💬 0

6250.034 - 6268.547 Bryan Cantrill

We got a bunch of ideas on how, you know, we've got the right foundation to go actually manage power holistically across the rack and use some of the... We got a lot of stuff to go do, a lot of knobs to turn. Um, and I think it's going to yield a pretty great product. I mean, hats off to AMD for sticking the landing.

0
💬 0

6268.647 - 6276.752 Bryan Cantrill

I mean, we are definitely wedded to AMD in a lot of ways in terms of our lowest level of platform initialization and so on. So we're always relieved when they execute well.

0
💬 0

6277.552 - 6277.813 George Cozma

Yeah.

0
💬 0

6278.773 - 6280.694 Bryan Cantrill

Um, and great decision in 2019.

0
💬 0

6281.775 - 6298.368 George Cozma

Yeah. Yeah. Um, and sort of wrapping up with Instinct, because a lot of people were concerned about the APU chip, which I think you and I had talked about.

0
💬 0

6298.748 - 6306.351 Bryan Cantrill

Yeah, we talked about the APU. Yeah, boy, we're hitting all the sympathy cards here. Hold on. Hand me the other sympathy card now.

0
💬 0

6306.531 - 6334.203 George Cozma

So I think there was some misunderstanding, or something was misheard in what was said. Um, because when I went and asked for clarification after the presentation, what it sounded like was: they aren't making APUs every generation right now because their customers see sort of the X SKUs as the AI chip and the A SKUs as the HPC chip.

0
💬 0

6334.904 - 6363.2 George Cozma

So all of the big hyperscalers are only looking at the X SKUs. And it's like, when you have to fight between the hyperscalers and this slightly more niche part, it's like, yeah, unfortunately the hyperscalers will win. But from what I was able to gather, they do see the APUs as the future for not just AI, but for HPC moving forward. So they are continuing development. There's no ending going on.

0
💬 0

6363.64 - 6383.843 George Cozma

Just much like 3D V-Cache: it was announced that Turin-X is not coming. And it's because the cadences are different, and there's certain dials that you get to pick, so to speak.

0
💬 0

6384.784 - 6388.646 Bryan Cantrill

When you were asking about the APU, did the Oxide people put you up to this?

0
💬 0

6389.627 - 6405.376 George Cozma

No, so this was something that I've been bugging them for a while about. And by the way, if you guys see an AMD MI300A dev kit come out, I'm going to claim some level of responsibility for that.

0
💬 0

6405.396 - 6412.421 Bryan Cantrill

That's great. We've asked for... Yeah, that'd be great. You will deserve responsibility for an MI300A dev kit. We'd love it.

0
💬 0

6413.297 - 6437.213 George Cozma

Yeah, I've been trying to get them to put that out and sell it on Newegg, like Ampere Computing sells their Altra Max bundle where it's the board and a CPU. I'm like, just sell that on Newegg for, I don't really care how much money, just have one.

0
💬 0

6439.229 - 6455.674 Bryan Cantrill

George, I don't know what we're going to do for you if an MI300A dev kit is for sale on Newegg, but we're going to do something very, very nice for you. I don't know when it's going to be yet. It is going to be... That would be great. We get George a whole wheel of his favorite cheese?

0
💬 0

6455.854 - 6457.455 Adam Leventhal

Yes, absolutely.

0
💬 0

6457.555 - 6479.228 Bryan Cantrill

That's right. For sure. But I totally agree. And yeah, we definitely noticed that the... You know, we have brought up APUs so frequently with them. I feel we've kind of overstayed our welcome with respect to that. Sort of like, you know what? We're going to let you guys say the next thing about APUs. We're going to stop telling you how much we love APUs and we'll let you do the next thing.

0
💬 0

6479.608 - 6493.719 Bryan Cantrill

Or we'll let George do it for us, which is great. Much more effective. Much more effective. Well, we'll say, yeah, again, if we get an MI300A dev kit on Newegg, it's going to be, we're going to have to do, it's going to be something spectacular. Yeah.

0
💬 0

6495.342 - 6511.289 George Cozma

But yeah, I've been pushing them to do that. And in general, I've been pushing them, as much as it's within my ability, to fix certain parts of their software stack. ROCm.

0
💬 0

6511.969 - 6513.11 Bryan Cantrill

Yes. Yeah.

0
💬 0

6513.13 - 6513.33 George Cozma

Yeah.

0
💬 0

6513.93 - 6525.433 Bryan Cantrill

Well, you know, and just actually, honestly, with openSIL, I mean, I think one of the things that we really like about AMD is it's a company that, like, it listens. And like, they kind of, they know what the right direction is. It takes them a while to get there sometimes because it's,

0
💬 0

6526.153 - 6526.473 Robert Mustacchi

Yeah.

0
💬 0

6527.033 - 6552.144 Bryan Cantrill

It's a big vessel. But, you know, I mean, I remember when we were getting Naples stood up way back in the day and, you know, there was a lot that still needed to be done. But you could see that, like, okay, this is on a trajectory that's really interesting. And then with Rome, it's like, okay, this has just got a lot more interesting. And it was clear to us in 2019, Robert, that they were,

0
💬 0

6553.064 - 6569.169 Bryan Cantrill

they were on a trajectory where they were surpassing Intel, effectively. And then obviously with Milan, and with Genoa, now Turin, I mean, it's like, we've seen them continue to execute, execute, execute. And so, yeah, let's keep at 'em on the APU side.

0
💬 0

6569.409 - 6594.324 George Cozma

Yeah. On the APU side, and just in general on the sort of getting there. Because the perennial problem for AMD has been software. And to the best of my ability, I've been trying to get through to them that they need to have ROCm support on every single piece of AMD. Anything that has the AMD logo on it should run ROCm.

0
💬 0

6594.664 - 6595.845 Bryan Cantrill

Period.

0
💬 0

6595.965 - 6603.731 George Cozma

With the one exception maybe being consoles, because those are a special little thing. But that's a different argument over there.

0
💬 0

6604.231 - 6606.133 Robert Mustacchi

That's for Sony and Microsoft to take up.

0
💬 0

6607.535 - 6611.239 Bryan Cantrill

Absolutely. Yeah. No, love it. And I totally agree.

0
💬 0

6611.64 - 6622.593 George Cozma

And you know, we're, it helps not hearing it from, from... But if any AMD users are listening to this: at Supercomputing and at CES, I'm going to be harping on you guys.

0
💬 0

6623.43 - 6648.685 Bryan Cantrill

Yes. So we love the parts, and we've got some ideas for things to be even better, but Turin is a great part. We're really excited about it. And George, thank you very much for joining us. It's been great to have you. Thank you for having me. Oh, yeah, it's great to have the team. I mean, Nathanael and Eric here, and Aaron as well.

0
💬 0

6648.766 - 6666 Bryan Cantrill

It's been, and obviously Robert here in the studio with me. And then Adam, of course, to correct my pronunciations and to inform me that my running start is not quite big enough on the back drilling. Um, the, um, you know, I think it's been, uh, we're really excited about our forthcoming.

0
💬 0

6666.44 - 6685.715 Bryan Cantrill

So one thing we're excited about is, of course, you're going to be able to take a Turin sled and put it into an unused cubby in an Oxide rack that has Milan sleds and just have the whole thing just work. So we're really excited about that. Um, and onward, great part. And George, thanks again. Really, really appreciate it.

0
💬 0

6686.535 - 6690.075 Bryan Cantrill

And thank you all for joining us.

0
💬 0

6690.516 - 6691.476 Aaron Hartwig

Speaking of excitement.

0
💬 0

6691.876 - 6694.436 Bryan Cantrill

Speaking of excitement, we do have one very exciting announcement.

0
💬 0

6695.076 - 6707.999 Aaron Hartwig

That's right. Take it away. dtrace.conf, we're back. So dtrace.conf is our approximately quadrennial Olympics-like... Let's see, they need the Olympics theme in here.

0
💬 0

6708.299 - 6709.679 Bryan Cantrill

The Olympiad has arrived.

0
💬 0

6710.299 - 6738.018 Aaron Hartwig

Our last Olympiad, 2020, was... I'm sure there'll be no copyright violation on the YouTube video if we do that. So we started in 2008, did it in '12 and '16. We were excited for 2020, canceled, and now we're back. So we're going to put the link in the notes. Link is going to go out to the folks who are here live, but it is December 11th, coming right up. So it's going to be an unconference.

0
💬 0

6738.578 - 6756.466 Bryan Cantrill

It's going to be an unconference. If you are a DTrace user, you want to come hang out at Oxide. We are going to charge you for tickets. It's not going to be too much money, but we do have to charge you something. Otherwise, it'll be immediately consumed by teenagers. Teenagers will consume every ticket if we don't charge you anything. Um, very limited supply.

0
💬 0

6756.486 - 6764.474 Bryan Cantrill

Um, so, uh, hop in there if you're interested in joining us. Um, yeah, it's going to be out of my, I'm, I'm really excited for this. It's gonna be fun.

0
💬 0

6764.815 - 6772.803 Aaron Hartwig

Oh, it's gonna be great. I mean, I mean, I feel like it's rude for me to say it's my favorite conference, but I've always loved it. I've loved it. It's been terrific. I know.

0
💬 0

6773.724 - 6790.88 Bryan Cantrill

And it's going to, yeah, it's going to have a different complexion and flavor this year for sure. It's going to be a lot of fun. So I'm looking forward to it. That's for sure. That's right. I know Robert, you've got to, everyone is just like, okay, what do I need to get done now before I've got until December 11th to get said to get my, but we got a lot of things to talk about. So.

0
💬 0

6791.861 - 6800.508 Bryan Cantrill

It's going to be fun. So join us, dtrace.conf 2024. And I will not violate any more copyrights by humming.

0
💬 0

6800.768 - 6805.852 Aaron Hartwig

With your humming? I think we dodged the bullet on that one. I don't know that it was so recognizable.

0
💬 0

6806.092 - 6814.398 Bryan Cantrill

Exactly. Awesome. All right. Well, George, thanks again. Thank you, everybody. And yes, see you at dtrace.conf 2024.

0
💬 0