An Expert’s Guide to Using MATLAB and Simulink for FPGA and SoC Design - MATLAB & Simulink

    An Expert’s Guide to Using MATLAB and Simulink for FPGA and SoC Design

    Overview


    Join us to learn how Adam Taylor, noted expert in Embedded Systems and FPGAs, uses MATLAB and Simulink with C and HDL code generation to design and develop space and automotive applications. 

    Using Model-Based Design, Adam will show algorithm development through hardware implementation, demonstrating how to optimize implementations and reduce development time. Attendees will learn how to use modeling and simulation in MATLAB and Simulink to speed up their development process by identifying design issues early in the design cycle, and then use automatic C and HDL code generation from models to deploy designs to FPGA and SoC hardware. Adam will also cover how to verify that implementations satisfy project requirements using automated methods.

    Highlights

    In this webinar, you will learn how to:

    • Apply Model-Based Design to modeling and simulation of digital designs for targeting FPGA and SoC hardware.
    • Analyze hardware designs and optimize latency, throughput, and resource usage.
    • Generate readable, synthesizable VHDL, Verilog, and SystemVerilog for FPGA and SoC implementation.
    • Deploy on FPGA and SoC hardware.
    • Test and verify hardware implementations.

    About the Presenters

    Adam Taylor 
    Founder and Principal Consultant, Adiuvo Engineering

    Adam Taylor is a chartered engineer and fellow of the Institute of Engineering and Technology based in Harlow, United Kingdom. Over his multi-decade career, he has had experience within the public and private sector, developing FPGA-based solutions for a range of applications including RADAR, nuclear reactors, satellites, cryptography, and image processing. Education is his deepest passion, and he has delivered thousands of hours' worth of training to corporate clients and casual hardware enthusiasts alike. 

    Adam is also the author of the MicroZed Chronicles, a weekly blog on FPGA/SoC development.

    Stephan van Beek
    Application Engineering Specialist EMEA for FPGA/SoC/ASIC, MathWorks 

    Stephan van Beek brings over 15 years of experience at MathWorks, Eindhoven, as a technical specialist addressing the Systems Engineering and Embedded Systems (i.e., FPGA and SoC) landscape. Stephan works with customers across Europe to apply the principles of Model-Based Systems Engineering with Model-Based Design. 

    Prior to joining MathWorks, Stephan was a member of the electronic design methodology team at Océ-Nederland, where he focused on enhancing design workflows. His professional journey also includes a chapter at Anorad Europe BV, delving into the intricacies of motion control systems. Stephan earned his B.Sc. degree in Electrical Engineering from the Eindhoven University of Applied Sciences.

    Tom Richter 
    Application Engineering Specialist EMEA for FPGA/SoC/ASIC, MathWorks 

    Tom Richter joined MathWorks in Germany in 2011, where he worked for 10 years as a Training Engineer for Model-Based Design, signal processing, communications, and code generation. With a focus on ASIC and FPGA design, he developed MathWorks training courses for HDL code generation and hardware/software co-design. In 2021, Tom became an application engineer specializing in HDL and System-on-Chip applications, where he works with a variety of customers and applications.  

    Tom holds a Master of Engineering degree from the University of Ulster in Belfast and a Diploma of Electrical Engineering from the University of Applied Sciences in Augsburg.

    Recorded: 29 Oct 2024

    STEPHAN VAN BEEK: Hello. My name is Stephan van Beek. I'm an Application Engineer at the MathWorks. I've been working at MathWorks for now about 16 years, and I've been mostly working with customers in the domains of systems engineering, FPGA, and SoC design.

    Today, we will be talking about "An Expert's Guide to Using MATLAB and Simulink for FPGA and SoC Design." And I will be doing this together with Tom and Adam. So, Tom, could you maybe introduce yourself?

    TOM RICHTER: Thank you, Stephan. Yeah, my name is Tom Richter. I've been an Application Engineer in the German office for three years now. Before that, I spent 10 years as a training engineer for FPGA, SoC, and ASIC design.

    Yeah, and I am now really looking forward to this webinar. And, Adam, could you maybe also introduce yourself? I have heard that you have a nice project to share with us.

    ADAM TAYLOR: Hello. I'm Adam Taylor. And I've been an FPGA Engineer for 24 years, and for the last 10 years now, I've run my own company, Adiuvo Engineering and Training.

    And today, I'm going to talk to you about a project that we've developed-- actually, two projects that we've developed using a model-based flow. Initially, I just want to set the groundwork a little bit on this, just so we can see what's going on. And a lot of our work, actually, at Adiuvo is done in the space and the defense and the high reliability arena.

    So when we work with FPGAs, we're typically working with FPGAs that will be targeted into space applications. So the devices that we use for these applications, yeah, they're a little bit more limited than what we see in the commercial environment, but they're getting much larger and much more capable. From the 300,000 gates of the Kintex to the nearly a million gates available in the Versal design, these devices bring significant challenges in designing with them effectively and efficiently and, of course, delivering on time and on quality.

    Now, the problem with all this is that actually, if you take a look at the Wilson Group survey, we're not actually getting much better at designing FPGAs. The FPGAs are getting larger, but the methodologies that we use and the approaches that we're using, they're not helping us develop FPGAs that don't have issues in them or projects that go smoothly.

    So if we take a look, roughly 84% of FPGAs leave for the field with some sort of nontrivial fault. And that leads on to the impact of missed schedules, and that comes with additional pain. And certainly, for a little company like us, that comes with quite a lot of pain: not only does the schedule begin to move out as the design gets more complex, but our payment milestones then move out as well, so that becomes quite a challenge for us.

    But obviously, we're not just using these in space. We're using these in commercial applications, industrial applications, aerospace, automotive applications. There's a huge range of applications that FPGAs are being developed for, and that's what we're using them for.

    So with this challenge in mind, and 20-odd years of working in FPGA design, we wanted to do something slightly differently over the last couple of years when we started to pick up some larger development contracts, and one of these things that we were keen to do was to move into using a more model-based flow. Now, actually, to get started with this, we used an internal-based architecting tool that we created.

    I'm not going to talk about it too much, but it was a SysML tool based around a common framework. But that allowed us, actually, to start with a system-level model of our FPGA. So we could start with a system-level model of the FPGA, but also of the software that it was going to talk to on the processor, and we could then explore those issues at the system level.

    One of the nice things about this approach that we developed was that we could then draw SysML diagrams of our FPGA architecture and our solution. So at key review points, such as system requirements review or preliminary design review, we could show the customer the actual architecture of the FPGA: such things as the processing stages that were going to be included, the clock domains, the register interfaces. It was a nice way of doing this.

    This gave us quite a nice way as well of tracing back from the architecture to the requirements that the customer had placed on us, because we'd done this in a systems modeling tool. Now, once we had created our architecture and our customers had signed off on the solution we were presenting, we could then code-generate from it. So we spent a lot of time and effort creating scripts that would convert that SysML architecture into a VHDL description.

    And this is actually where it begins to get very interesting for us because-- our system and our flow is presented here. So we were using SysML to analyze the system, to capture the requirements, and to define the FPGA architecture. But actually, once we'd done that, all of the functionality, all of the complicated functionality, the behavior that we put into the design, that was added in by using MATLAB and Simulink and HDL Coder, plus some of the-- plus some of the toolboxes. So this gave us the ability to go away and have a complete model-based solution to our design.

    So a lot of our design was highly computationally complex. A lot of it was doing image processing, data processing-type applications. So the ability to create the model in MATLAB and Simulink, and quickly-- as we will see throughout the rest of this presentation as well-- simulate, see how it behaves, and then translate that into the hardware, became really important for us and gave us a real boost.

    In fact, I'm not being funny, actually, when I talk about these next two projects that we've got. I don't think I've ever seen projects come together quite as fast and work pretty much out of the box on day 1, throughout my entire career, to be honest. So this was our approach to do this, very heavily MATLAB Simulink to give us all of that great functionality that we wanted within the FPGA.

    So I'll talk to you a little bit about the two applications that we have, that we've done this for initially. So one of these was a space-based imaging application, in that it will be in orbit, it will be capturing images from an image sensor, and that image sensor would then be post-processed using the FPGA to reconstitute the image and to forward it on to the downstream storage in the satellite.

    There's an example there of the test equipment, actually, on the bench for this. So this was targeting a PolarFire FPGA, just a pure FPGA, no SoC. And everything that we did in here was developed using this MATLAB and Simulink methodology.

    So from the complicated sensor interfacing all the way through to the more simple, shall we say, image processing algorithms, such as the dark frame subtraction and histograms, all of that processing chain was implemented using MATLAB and Simulink. The only real hand-written VHDL/Verilog we wrote was a few little things around DMAs and FIFOs and suchlike for clock domain crossing.

    The other application that we've developed a solution around is actually a satellite tracking application, and that, again, uses image processing techniques to detect a satellite and track it in orbit for communication with it. Again, there was a lot of image processing work in there, so we're doing a lot of image recognition, image sensing. And all of that, again, was done using functionality that came from the MATLAB and Simulink flow.

    And this was a really great-- this is a really great benefit to us because it means to us that our model is the master, and we can quickly iterate and quickly try what-if scenarios, particularly when you're trying to fine-tune an algorithm like in the later case, where it's split between some of the algorithms being run on a processor in a Zynq MPSoC processor core and some being run in programmable logic. So it's very quick to iterate round and move elements of the algorithm from one aspect to the other to test those corner cases and see what we want to do with it.

    So as I was saying, our flow was predominantly based, actually, around Simulink. So there's a couple of examples here of IP cores that we've used in these solutions. One of them, created using HDL Coder, is a frame subtraction module. So as an image frame is being captured from an image sensor, there is a reference frame that is stored within memory, in DDR4 memory.

    And this goes away, reads out that image on a line-by-line basis from the DDR4 memory as the image comes in, and it subtracts one frame from the other. So it subtracts the stored frame from the live frame as it comes in, and that allows us to remove artifacts and noise that might be inherent in the system and give us a better-quality frame at the output. Of course, you can turn this off should you want to, and the data will just flow straight through from the input to the output.
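    The dark-frame subtraction described above can be sketched in a few lines of MATLAB. This is an illustrative sketch, not the project's fixed-point Simulink implementation; the frame sizes, bit depth, and variable names are assumptions.

    ```matlab
    % Sketch: dark-frame subtraction with saturation and a bypass path.
    live = uint16(randi([0 4095], 480, 640));   % live frame from the sensor (hypothetical size)
    ref  = uint16(randi([0 64],   480, 640));   % stored reference (dark) frame
    bypass = false;                             % the stage can be switched off

    if bypass
        out = live;                             % data flows straight through
    else
        % Widen to signed, subtract, clamp at zero, narrow back to uint16
        out = uint16(max(int32(live) - int32(ref), 0));
    end
    ```

    In hardware the same operation runs line by line as pixels stream in, with the reference line fetched from DDR4 in step with the live data.
    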

    So that's one of the things we tried to do with these solutions as well: make them really configurable, such that you can configure the processing pipeline if you want to, with stages enabled or not. So whereas this one sits very much within the processing pipeline stage-- moving from the image sensor to frame subtraction, to frame correction, and so on through that image processing pipeline-- the next block I wanted to talk about a little bit was our histogram block.

    So again, using Simulink, we developed a histogram module that would calculate the histogram for the pixels in the image. This one was used as part of our tracking algorithm. I should have said at the beginning, actually, that these are grayscale images, by the way, in both applications.

    So we're creating this histogram of pixels. And then once we've got that, we're sharing that information with the processor core in a Zynq MPSoC, such that the processing core can then begin to take decisions and configure the remainder of the processing chain depending upon what's in the results of the histogram.
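    The histogram the FPGA computes and shares with the processor can be sketched on the host like this. The bit depth and bin count here are illustrative assumptions, not the project's actual parameters.

    ```matlab
    % Sketch: pixel histogram of a grayscale frame, of the kind handed to
    % the Zynq MPSoC processor core to steer the rest of the chain.
    frame  = uint16(randi([0 1023], 480, 640));  % 10-bit grayscale frame (assumed)
    edges  = 0:4:1024;                           % 256 bins across the pixel range
    counts = histcounts(frame(:), edges);        % per-bin pixel counts
    ```

    In the FPGA this becomes a RAM of bin counters incremented as pixels stream past, with the counts read out over the processor interface.
    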

    So, two reasonably complex pieces of functionality that were implemented relatively simply using the MATLAB and Simulink model. When it comes to this, we've been quite big supporters and users of this approach. And like I said, I've found it really beneficial to us, to do this as a company, as a company owner.

    Now, one of the challenges people always raise when you look at generated code is the question: What does the output code look like? So here's a quick example of what the output code looks like when we've generated it. It's very nice and simple.

    It's human readable. It's not just complicated text that can't be read. It's good-quality code that drops out at the back end as well.

    So that really wraps up my little bit of the presentation of the projects, and we've got a lot more things to talk about as we go through here. So I'll pass it back to Stephan, and we'll go from there.

    STEPHAN VAN BEEK: Wow. Thank you for presenting these amazing projects, Adam. So if you would look back, was there anything you missed in the workflow, or was there anything you would do differently in that sense?

    ADAM TAYLOR: So I think one of the things that we would do differently, actually, was I would like to be able to trace the requirements back to an actual implementation somewhere in the system, so be able to trace not just the architectural level, but the final implementation, shall we say.

    STEPHAN VAN BEEK: Yeah, and I can imagine-- I mean, so you mentioned that you were working a lot in space, automotive. I mean, certification, it could be, let's say, a related topic, and in such case, traceability would be key, I would say.

    ADAM TAYLOR: Yeah. Exactly that, yeah.

    STEPHAN VAN BEEK: Yeah. So let's-- so basically, Tom and I worked also on our little project, and let's try to touch on some of these topics in that project. And what it is that we actually would like to address here is-- so indeed, as you said, we have our requirements. Based on these requirements, we build models, and these models could be really built in either MATLAB, Simulink, Stateflow.

    Could be a physical network. Could be a neural network. But the idea is that once you have a model, you can actually run your simulations for validation purposes, and with that, you would actually have bidirectional traceability.

    From that, we generate our code. Could be Verilog code, SystemVerilog code, or VHDL code. Clearly, the code needs to be verified.

    Again, we will touch on that as well. And then from there on, we can either build our prototypes or production deployments on FPGA SoC hardware, or we may even want to go to ASIC. So this is kind of, in a nutshell, let's say, our vision on these type of workflows.

    So let's first see how this would work out from the context of simulation. So let me go to MATLAB. So this is the latest release of MATLAB.

    So Tom and I have been working together using Projects, using version control. So everything is in Git, so we can check in, check out models quite easily. So if we open up the model that we have been working on, which is a streaming FFT, basically, we can run our simulations, and the simulation gives us an output, and based on this output, we can then assess whether this is meeting our expectations or meeting our requirements or not.

    Now, the point here is that you need to establish traceability, right? And for that purpose, we imported our requirements-- in our case, from Excel, but we could have used other systems like DOORS, Polarion, or Jama. Anything that speaks ReqIF, we can import from and export to.

    But here you see that we basically linked requirements to each component, and again, by selecting a component, you can see that it actually highlights the requirement in the Requirements view, and vice versa. I can select a requirement, and that highlights the component in the model, basically. So that provides traceability.

    So in this case, I have requirements that link to the intended functionality that I need to implement. So we need an FFT. We need a complex-to-magnitude transformation, all in fixed point. Then we need to do a dB conversion, which is the part that we have chosen to do in floating point. And then the inputs and outputs are AXI-Stream, again both based on fixed-point data types, basically.
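    As a reference for what the chain computes, the three processing steps the requirements describe can be checked numerically in a few lines of MATLAB (fixed-point details omitted; the test signal is an arbitrary choice).

    ```matlab
    % Numeric sketch of the chain: FFT -> complex-to-magnitude -> dB conversion.
    n   = 0:1023;
    x   = sin(2*pi*50*n/1024) + sin(2*pi*120*n/1024);  % two-tone test frame
    X   = fft(x);                  % streaming FFT in hardware (fixed point)
    mag = abs(X);                  % complex to magnitude (fixed point)
    ydB = 20*log10(mag + eps);     % dB conversion (floating point in the design)
    ```

    The two peaks in `ydB` correspond to the two sine waves, which is the shape seen later in both the simulation and on-board plots.
    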

    So this is, let's say, a few of the requirements that we were taking as an input. And what is nice as well here is that this Implemented column shows you the allocation status of each requirement to a particular part of the model. So it tells me, OK, which requirements have not yet been allocated and still need to be worked on, for instance.
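    The import-and-link step described above can also be scripted with Requirements Toolbox. This is a hedged sketch: the file name, requirement-set name, summary text, and block path are all hypothetical.

    ```matlab
    % Sketch: import requirements from Excel and link one to a model element.
    reqSet = slreq.import("fft_requirements.xlsx", ...
        "ReqSet", "StreamingFFTReqs");        % ReqIF sources work the same way
    req = find(reqSet, "Type", "Requirement", ...
        "Summary", "dB conversion");          % locate a requirement by property
    link = slreq.createLink(req, "streaming_fft/dB Conversion");  % link to a block
    ```

    Once such links exist, selecting a component highlights its requirements and vice versa, and the links carry through to the generated HDL.
    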

    Now, let me just do this-- it basically already ran our simulations, so the model is showing me the expected behavior. So we can actually turn this into code, and again, this type of traceability essentially continues into the code as well. The moment I select a component, this is our requirement, and we actually have the requirement information also present in the code.

    And in some cases, depending on the criticality of your application, you may need to have traceability from your requirements all the way down to the code that you have developed, regardless of whether it's autogenerated or handwritten. But since we created this link, this traceability is extended into the HDL code by HDL Coder automatically.

    ADAM TAYLOR: That's--

    STEPHAN VAN BEEK: Yup?

    ADAM TAYLOR: I was going to say that's really interesting. I have one question for you, though, as we're looking through this. With your requirements, you have a requirement on the AXI-Stream interface, and I see on your model-- so how'd you make this-- how'd you make this become an AXI interface in the model?

    How do I tell it? Because that's really important to me. All of our interfaces tend to be, wherever possible--

    STEPHAN VAN BEEK: Yeah, I can imagine that, Adam. So essentially, the key to modeling here is abstraction. So I do not physically have the AXI interfaces in the model. But when I'm generating an IP core, I basically have the ability to assign interfaces to each one of these inputs and outputs.

    So for instance, for my inputs, I could specify this to be an AXI4-Stream slave, for instance. And there's a data port, and there's a valid port. Same for the outputs-- those are AXI4-Stream masters. And there are other interfaces that I will talk about in a moment as well, but this is where I do the assignments.
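    The same interface assignments made in the IP Core Generation dialog can be scripted with `hdlset_param`. This is a sketch; the model name and port paths are hypothetical.

    ```matlab
    % Sketch: assigning target interfaces to DUT ports from the command line.
    load_system("streaming_fft");
    hdlset_param("streaming_fft/x_in",  "IOInterface", "AXI4-Stream");  % input  -> slave side
    hdlset_param("streaming_fft/y_out", "IOInterface", "AXI4-Stream");  % output -> master side
    ```

    The slave/master direction follows the port direction; the data and valid signals of each port are grouped into the one streaming interface.
    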

    Now, the moment we generate code-- and let me just open up the report-- it's not just code you're getting. You're getting a full traceability report.

    So again, there was already model-to-code and code-to-model traceability. But this is a view where I get more data and information about the code that we have generated and how we have generated it. In this case, we have the IP core generation report.

    This shows us some details on the IP core that we have created; the target platform that we have selected-- in this case, an UltraScale platform; which reference design we're using-- a reference design with AXI4-Stream interfaces; and further details on the interfaces of the model and how we have assigned them to more target-specific interfaces, like AXI-based interfaces, for instance.

    Another interesting piece that this report tells me something about is timing and area. In the end, your implementation needs to fit in your device. You mentioned the PolarFire. So it needs to fit your target device. So here we get these high-level resource reports-- multipliers and so on. This is still a generic report, so no target-specific resources, but it already tells me something about code efficiency.

    Likewise, since part of the model was done in floating point, we also have a floating-point resource report. In this case, we have a mixture of so-called native floating point-- where we have generated IEEE-compliant, target-agnostic floating-point HDL code-- combined with some floating-point operators for which we have used the floating-point libraries that AMD provides.

    And that's what you're seeing here. And this also tells me how many lookup tables and registers this particular multiplier is using, for instance. So again, the resource usage is quite important here.

    And likewise for the critical path-- here we have done a critical path estimation. So apparently, the critical path is in this log10 function, about 3 nanoseconds, which is pretty good. And it's not just this table; it also provides the ability to actually highlight the critical path in the model.

    So here, this is our log function, and this is also where, yeah, the critical path is captured, inside this log function. But along the way, you actually see the entire implementation, with all of the pipeline delays and matching delays that are needed in this model, for instance.

    ADAM TAYLOR: Yeah, I was just going to ask about that, actually-- about how you go about adding pipelines between components and making sure that the delays through the components are all matched and nicely aligned.

    STEPHAN VAN BEEK: Yeah, I can fully understand that, because you want to isolate from a timing perspective [AUDIO OUT] so that if you have a timing issue, you know where to fix the issues. So now, in this case, in these two subsystems, you can already see that we have included these pipeline delays, and this has actually been done by HDL Coder automatically.

    And basically, how this is done is: if I just select a component, I can go to the HDL Property Inspector, and here it provides me with all of the HDL Coder-specific options that I can use to configure how that particular subsystem needs to be implemented, like pipelining, for instance. So here I specified that I want one stage of pipelining at the input and one stage of pipelining at the output, and that's exactly how HDL Coder has implemented it.

    Alternatively, you may want to physically put in delay blocks. Again, those would translate into registers as well. And in some cases, you may have to do this because you may want to incorporate the behavior of these delay blocks in your simulations. But again, there are many more options, like sharing options, that you could also specify at the subsystem level or at the component level, even.
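    The pipelining and sharing options mentioned above can also be set programmatically rather than through the HDL Property Inspector. This is a sketch; the subsystem path is hypothetical.

    ```matlab
    % Sketch: per-subsystem HDL implementation options from the command line.
    hdlset_param("streaming_fft/Complex2Mag", "InputPipeline", 1);   % one input register stage
    hdlset_param("streaming_fft/Complex2Mag", "OutputPipeline", 1);  % one output register stage
    hdlset_param("streaming_fft/Complex2Mag", "SharingFactor", 4);   % time-multiplex resources
    ```

    HDL Coder then inserts the registers and any matching delays on parallel paths so the design stays cycle-accurate.
    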

    ADAM TAYLOR: So the sharing option is actually an interesting point as well, because, obviously, one of the things we want to do with FPGA design-- and you've touched on it a little bit in the resource report-- is make sure that we use the most efficient resources available, that things get targeted correctly to block RAMs and DSPs. Can you explain that to me a little bit?

    STEPHAN VAN BEEK: Yeah, sure. So sharing is one way, but that's a target-independent way of saving resources, which is very useful if you have, let's say, a big gap between the clock rate and the data rate. If that's not the case, there are still other options that you could consider to at least ensure a more efficient mapping to the FPGA resources.

    And let me go back to the report again. So one way, for instance, here-- we have purposely chosen to use the AMD floating-point operators wherever we can. In most FPGAs, these would still use logic to implement the floating-point operations. However, if you were targeting, for instance, Versal's hard floating-point resources on board, then in such a case, we would actually use those hard floating-point DSP resources through this.

    And for the other things, yeah, we try to generate coding patterns that are efficient, or at least that comply with the coding rules of the synthesis tool, so that mapping to FPGA resources can be done as efficiently as possible-- like block RAM. I mean, if you look in the resource report, we're also using RAM blocks, and it can tell you that these RAM blocks will be mapped to block RAM or to UltraRAM depending on the chosen target technology.

    Let me see. Yeah, so maybe just a few things still on the IP core report-- so again, I said we generate an IP core. We have integrated the IP core automatically into a reference design, for which we generated the bitstream. Now, of course, I can imagine-- and this is where you find more information in this report-- that you just want to have the IP core, and you want to do the integration yourself.

    ADAM TAYLOR: Yeah, that was going to be my next question. How do we take this and how do I integrate it within, like, my Vivado project?

    STEPHAN VAN BEEK: Exactly. And this is, again, explained in this IP core report as well. There's a zip file that contains everything that is relevant for this IP core, and if you follow these steps, you can pull it into your Vivado block design, connect it to the rest of your design, and build a bitstream that way, for instance. I mean, both are perfectly valid and supported workflows, basically.

    ADAM TAYLOR: So basically, this means that I can take this FFT now, then, and test it on the board using MATLAB?

    STEPHAN VAN BEEK: Absolutely. So it's good that you bring this up. Let me go back to my presentation, because, ideally, after you have generated your RTL code and your IP cores, you actually want to run some tests on the board. And, yeah, in my case, I would actually use MATLAB for automating the testing process.

    Now, let me go back to my model. So basically, along with the code, we have also generated a so-called host interface script. And let me show you the script which we generated for this project-- again, this is a mostly automatically generated script.

    And if I just step through it, as a first step, I just import some data, and I just plot the data. It's simply a few sine waves that you're looking at. But it's always good to know what you're kind of sending in.

    Then we create an object, and this object gives me a way to connect to the FPGA board, or the SoC board in this case. So let me just run this. So basically, what happens here is that we are running this function, and this function contains all of the details of the interface that we are using, like address offsets, frame lengths, et cetera. So that's what has now happened.

    And now I'm actually having an object that allows me to write and read to my AXI-Stream interfaces. Like, here I have this writePort command. I have a readPort command. So let me just run this.

    So we are writing one frame of data to the FPGA. It's being processed. So we run the FFT on it.

    We calculate the magnitude. We perform the dB conversion, and then we read the data back out. And this is now shown in this plot again.
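    A generated host interface script of the kind being run here typically looks like the following. This is a hedged sketch: the setup function name, port names, and test signal are model-specific assumptions, not the actual generated code.

    ```matlab
    % Sketch of a generated host interface script for board-level testing.
    hFPGA = fpga("Xilinx");                       % object representing the board connection
    gs_streaming_fft_setup(hFPGA);                % generated setup: interfaces, offsets, frame sizes
    n = 0:1023;
    x = sin(2*pi*50*n/1024) + sin(2*pi*120*n/1024);  % one frame: two summed sine waves
    writePort(hFPGA, "x_in", x);                  % write the frame over AXI4-Stream
    y = readPort(hFPGA, "y_out");                 % read back the FFT->magnitude->dB result
    release(hFPGA);
    ```

    Because the script is generated from the model, the port names and frame lengths always match the interfaces that were assigned during IP core generation.
    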

    ADAM TAYLOR: So that looks really similar to the simulation model that we showed earlier on, and I know we'll talk about that in a little while.

    STEPHAN VAN BEEK: Exactly. Yeah.

    ADAM TAYLOR: But, so the interfaces that you're talking to here, they're just the top-level AXI interfaces?

    STEPHAN VAN BEEK: Mm-hmm.

    ADAM TAYLOR: So what if I want to access internal signals? What if I want to take a look at what's going on inside the FPGA in a little bit more detail than just looking at those AXI streams?

    STEPHAN VAN BEEK: Yeah, so you want to further debug your model, because I can imagine that this may not immediately show you the expected behavior, and then you always end up in these kinds of debugging situations. Now, I already mentioned-- and let me go back to the model-- that we had other interfaces, so-called FPGA data capture interfaces. Now, what we did is we basically put test points in the model.

    So again, for those model internals, you can simply mark them with a test point, and for test points, HDL Coder can automatically generate DUT output ports. So there's no need to wire these signals; they become explicit output ports. This is done by HDL Coder automatically. It helps you to keep the model clean, which is good.

    And then, likewise, they become available to me as an interface that I can assign to a so-called FPGA data capture interface. Now, FPGA data capture, you could see this as something similar to what the FPGA vendors are also providing for debugging, except that it's more tightly integrated with the whole MathWorks toolchain. So here, I can further configure how I want to do this: the buffer size, whether I want to have trigger condition logic, and so forth.

    And when I now generate code or an IP core, I get this data capture app, basically. So it's really specific to the model. So here, again, it shows me how many samples we are capturing. I actually, in this case, want to set up a trigger because I only want to capture data once valid data arrives at the output. So I just set my trigger here.

    And let me run this first. So basically, I just say capture data. So it will wait until it receives the trigger. So let me just go through this once more.

    OK. So here, we have again captured one frame of valid output data. We're using the Logic Analyzer scope to visualize it. But let me just change the visualization a bit.

    So I want to make this a bit taller, like this. I want to change the radix to signed decimal, and I want to plot this in an analog fashion. And again, here, this is the output of my system.

    And again, we're looking at the two sine waves-- at the two peaks that, again, look similar to what we're seeing here. But this is the output of both real and imaginary data, and this is the output of my complex-to-magnitude conversion, basically.

    ADAM TAYLOR: OK. That's really interesting. That's really cool. I like that.

    STEPHAN VAN BEEK: So this basically gives me, let's say, the ability to put these test points at any place I see fit. I can change the sample depth as I see fit. So that's-- yeah.

    ADAM TAYLOR: So this is what we would do, then, if I was sort of an engineer, and I was developing my application. I'd put this on there. I'd test it. I'd fault-find it because, obviously, nothing ever works quite a hundred percent how you expect it to the first time around.

    But once we've done that, how do we then go about verifying it? Because, obviously, a lot of the things we do in space and automotive require more formal verification, so how do we go about more formally verifying that the RTL code is correct for the algorithm that we've just asked it to implement?

    STEPHAN VAN BEEK: So this is a very good question, Adam, and I think that Tom might have an answer to that question.

    TOM RICHTER: Yes, thank you, Stephan. That's exactly what I can do now. I can explain a little bit about the functional verification.

    So what Stephan was explaining was more about prototyping and debugging, using AXI Manager and data capture. We can also do functional verification using HDL Verifier, with cosimulation, for example, which I will show you, and also with FPGA-in-the-loop. Other things which are possible are SystemVerilog DPI component generation, or even UVM. All of this is accessible through the ASIC Testbench, which you can install together with HDL Verifier as a support package.

    OK, let me go back to the report that Stephan was showing before. And, yeah, you might have seen there are a couple of pieces of information here which we haven't looked at yet-- for example, some warnings which show up, and some reports, like the delay balancing report, where you see the pipelining delay that is used here.

    Also, Stephan was already showing the floating-point report, and there we saw that we had not only native floating-point usage-- the IP from HDL Coder-- but also AMD floating-point operators here. All of this information can be important when you set up your cosimulation, because you might need extra libraries for the AMD floating-point IP, and you might need to account for the delay balancing by ignoring those first samples. You also might have to apply some settings which are given to you in these warnings-- for example, here, that the reset has to be done in a certain way; that is also because of the AMD floating-point IP.

    Right, you could make all of these settings in the Configuration Parameters, for example, and you could also generate the testbenches from the Simulink canvas directly in the model. All of that is possible, but I would like to show you a different way, because the good thing is, with MATLAB, Simulink, and HDL Coder, you can also automate such code generation and verification tasks.

    For example, I was generating the code directly with a script. It's just a very simple command. Let me now switch to the next section, and then let's also generate the cosimulation model with one command line.

    While it is running-- that might take a little bit of time-- let me explain these different settings I made. So here you see, for example, the setting for the reset length-- that was the warning we got in the report-- and also that we should ignore the first 44 samples. All of these settings, by the way, end up in the testbench environment only; they have nothing to do with the code that was generated before.
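A hedged sketch of what such a script might look like. The model name and the specific values are placeholder assumptions; `ResetLength`, `IgnoreDataChecking`, and `GenerateCoSimModel` are HDL Coder parameters, but the demo's actual settings are not shown in full here.

```matlab
% Sketch of the scripted flow (model name and values are placeholders).
load_system('dspModel');

% Testbench-only settings taken from the report warnings:
% hold reset for the length the AMD floating-point IP requires, and
% skip the first 44 output samples added by delay balancing.
hdlset_param('dspModel', 'ResetLength', 2);
hdlset_param('dspModel', 'IgnoreDataChecking', 44);

% Generate HDL for the DUT, then a cosimulation testbench/model
% for a ModelSim/Questa-compatible simulator.
makehdl('dspModel/DUT');
makehdltb('dspModel/DUT', 'GenerateCoSimModel', 'ModelSim');
```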

    Right, another important thing you should consider is that when you have AMD floating-point IP, you might first need to compile the simulation library for your HDL simulator. In this case, we are using Questa, and we want to do this for a certain library from Vivado. That is achieved with a command like this.

    OK. This path, again, needs to be passed to the configuration as well. I did this here with a command as well, Simulation Library Path, and set it to the Xilinx Questa library path. So you see, already something is happening in the background.
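For reference, the two steps being described might look roughly like this; the paths and the exact Vivado command options are illustrative assumptions.

```matlab
% The AMD/Xilinx simulation libraries are compiled once from Vivado's
% Tcl console, with a command along these lines (paths are placeholders):
%
%   compile_simlib -simulator questa -directory C:/xilinx_questa_libs
%
% HDL Coder is then pointed at the compiled libraries so the cosimulation
% testbench can resolve the AMD floating-point IP:
hdlset_param('dspModel', 'SimulationLibPath', 'C:/xilinx_questa_libs');
```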

    The cosimulation model has opened. Right, we have it here. And, yup, you can see it comes with two parts. The upper part is showing the original model-- or actually, in more detail, the generated model that Stephan was showing before.

    Why is that? Because this generated model also includes the pipeline delay that was added during code generation. So this is just a behavioral model, and it gives you the extra latency as well.

    Then, here at the bottom, there is this cosimulation block that was automatically generated during the cosimulation model generation process. It has a couple of settings for your HDL simulator which you could even change. And the next step is to open your QuestaSim.

    And that can be achieved by this button. Let me double-click it. So this might now take a little bit of time to open.

    ADAM TAYLOR: While that opens, Tom, I'll ask you a question about this, because this looks really interesting. So we've got the cosimulation, and we've got some other things that we're going to be showing. But how important is it to actually do this? Can I not just trust that the code generation is always going to be correct?

    TOM RICHTER: That's a very good question, actually, Adam. It's a question I often got, even when I did training in the past: Why should I actually do this? Now, one thing we saw is that we included an extra IP from AMD here, and we have not yet checked that it is going to work. That's one reason.

    Another reason-- you actually said it yourself already. You explained that you work with certification customers, and for certification, you have to do [AUDIO OUT] such a step for showing that the code you have is working correctly, that it is according to the requirements. So you cannot get around it.

    ADAM TAYLOR: Yeah, that makes perfect sense to me.

    TOM RICHTER: Yeah. So now let's have a look here. Yeah, the QuestaSim has now settled a bit. As you can see, it is now saying "Ready for cosimulation."

    Now let's go back to the model. Yeah. As you can see here, this is the cosimulation model. We have launched the cosimulation-- or the simulator.

    So let's now run it. There are actually some screens in the background which are showing the data output. And as you can see here, right now, we compare the upper model with the lower model, which is the cosimulation.

    And, yeah, right now it looks good, right? So there's nothing, there's no change or no difference between the two different simulations. It looks pretty good--

    ADAM TAYLOR: Well, apart--

    TOM RICHTER: --yeah, because otherwise, you would see it's a--

    ADAM TAYLOR: Apart from the little peak-- apart from the little peak down the bottom, yeah?

    TOM RICHTER: Ah, OK. You are really picky, Adam. But you are right, yeah. So there was actually-- [LAUGHS] there was actually a mismatch.

    You are completely right. And let's maybe have a look into this a little closer. Why is this happening?

    So I'll zoom in a bit on this portion. OK. We have nice cursor measurement tools here, which I can switch on. I will just place these two cursors on it, and the second cursor, let me set to the other line.

    And you can see that the difference between these two signals is 1.953e-3. So, first of all, it's a very small value. You might think, Where is this coming from?

    Now, you must know that in this block, where we do the last part, there is a lot of floating-point IP, but finally, we actually go from floating point back to fixed point again. Now, the floating-point IP behaves very slightly differently on different processor architectures.

    So that's why there might be a very, very small difference, but not as big as this one. This, however, happens when it later comes to the fixed-point conversion. With 9 fraction bits-- you can see this here-- that difference is exactly the precision of the fixed point we have there.
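As a quick sanity check, the cursor reading matches one least significant bit of a format with 9 fraction bits:

```matlab
% One LSB of a fixed-point type with 9 fraction bits:
lsb = 2^-9    % = 0.001953125, i.e. the 1.953e-3 read off the cursors
% A rounding difference in the float-to-fixed conversion therefore
% shows up as a multiple of this step.
```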

    Now you might think how to deal with that, right? And I actually have prepared a cosimulation model which is actually doing this. Just a second. It's this model here.

    And what I did was put an extra subsystem inside this comparison of the data, one that does not just compare the data for exact equality; instead it checks against the tolerance we have here, 3e-3, which is a bit more than what we saw. That is now checked. So if you run this model, you will see no peak anymore.

    Right, it really depends on your requirements and so on. On top of that, you can, of course, also make it a bit more accurate in general, using, for example, different rounding methods like Nearest. But consider that this also takes extra hardware resources.

    Yeah. You can also use more fraction bits, which also creates a bit more hardware in the end. But, yeah, these are things you have to deal with, right?

    ADAM TAYLOR: Yeah. Yeah, there's always those little errors and quantization issues that you have to work with as an engineer. It's why we get paid the big bucks, right?

    TOM RICHTER: Exactly. Exactly.

    ADAM TAYLOR: So this is really cool. I mean, it's fantastically interesting. And it's something, I should admit, that I actually need to spend more time learning. A lot of the projects we've talked about were actually developed by my team.

    So how do I go about learning more about this? Are there some training resources that you can point me to, or do I just open up MATLAB and start experimenting, which is probably not the right way to do it?

    TOM RICHTER: Yes, definitely. I will show you this later, but you might also think: Is there maybe another way of doing this verification?

    So we just showed the verification in software, right? And you might think: Is it also possible to do it directly on hardware, maybe even accelerating the simulation with hardware? And the answer is actually yes.

    And the good thing is, with Model-Based Design, you can reuse the testbench environments you already have. So, for example, this cosimulation model can be reused. I have it prepared already. It's a model where I just deleted the cosimulation block-- this one here.

    So the final thing is that we have to bring in a block very similar to the cosimulation block, but instead of doing the cosimulation, it does it on the FPGA. When you go to the HDL Verifier tab and then to FPGA-in-the-Loop, you have the Import HDL Files button here. I can click on that.
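As an aside, the same setup can also be started from the MATLAB command line, which is handy when scripting the flow; this assumes HDL Verifier and the relevant FPGA board support package are installed.

```matlab
% Launch the FPGA-in-the-Loop setup wizard programmatically
% (equivalent to the HDL Verifier tab > FPGA-in-the-Loop path):
filWizard
```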

    And then we can select what we want to do. We want to do this for a certain board. This is the development board we test it on.

    We want to do it with the USB 3.0 Ethernet connection. And then-- we can go to Next-- we have to add the files which we were already checking with the cosimulation. They are inside this folder. There are a lot of them.

    So let me select them. And let's get them. But don't forget, there is another file we have to add, right?

    Before I do that, let me also specify the top-level entity. That's this one here. And there's another file, because, remember, we had this AMD floating-point IP, and now we have to make sure that we add the correct file.

    OK. So let me go inside here. There are some VHDL files. This VHDL file was used for the simulation; that is not the one we can use for the implementation, however.

    So now we implement this on the board. That's why we have to pick a different file, and that's actually this XCI file, which is an encrypted file from AMD. OK. So I have to select this one and add it to the files.

    So now I can go to Next. Then we can see there are a lot of signals-- clocks, resets. These are actually already correct. Also the reset asserted level and so on-- all of that fits. So I can go to Next.

    But then we have to specify how to interpret the different types-- the different bits and so on-- coming out there. So this one could be a Boolean, and also the last one, the validout, can be a Boolean. The data, however, is fixed point, and in our case, it should be interpreted as signed with 9 fraction bits.

    ADAM TAYLOR: Yeah?

    TOM RICHTER: Yeah. We can just define that here. And we go to Next, and the final thing we have to do, we have to click on Build. But you work with FPGAs that--

    ADAM TAYLOR: We don't have time for that.

    TOM RICHTER: You know how long a synthesis can take. Right. That's why I already have one. So this is actually what you get: from the process, you get this block, and you also get the bitstream file in your code generation folder.

    Now, I did an extra step here for this model, and I want to tell you about it. So inside this system, there was just a line here before. What I did was include such a Buffer block there.

    Now, this Buffer block has a certain purpose. With it, we go from sample-based to frame-based processing, and this accelerates the simulation a lot, because if you just provide one sample at a time to the FPGA, it simply takes too long. That's why we do that here. On the other hand, we have to unbuffer the output again, and we have to make sure that the data coming from the other model has the same delay.
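What the Buffer/Unbuffer pair does can be sketched in plain MATLAB; the frame size here is a placeholder, not the value used in the demo.

```matlab
% Buffer: collect a sample stream into frames (one column per frame).
% Unbuffer: flatten the frames back into a sample stream.
frameSize = 1024;                    % placeholder frame size
x = (1:4096).';                      % sample stream
frames = reshape(x, frameSize, []);  % what the Buffer block does
y = frames(:);                       % what the Unbuffer block does
isequal(x, y)                        % returns true: round trip is lossless
```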

    So this is the delay, and this is the frame size we have used. OK. And once you have that, the only thing you have to do is go into the HDL Verifier tab, go into the tool again, and when you have the bitstream, you will also see the Load Bitstream button.

    OK. I can click on that and then bring it to the board. I have done it already. So the board is already set up, and the only thing we have to do is now running the system.

    And you can see that was pretty quick. Compared to the cosimulation we did before, that was really quick. And let's have a look at these scopes again, at these comparison scopes, if I can find them.

    Here it is. It's a bit hidden. Just a sec. There it is. Yeah.

    So you can already see that there is a big latency. That's because of the frame-based processing. And we did this for both, and there is no mismatch anymore, because we also have the tolerance check here. OK.

    So with that, I would say we have shown everything. We showed how we can do this in software and how we can do this on hardware.

    ADAM TAYLOR: That's excellent. Thank you very much. I'm going to come back to the question that I asked earlier on, prematurely. How can I learn more about this process?

    TOM RICHTER: Excellent. So let me come to that. What Stephan showed was the way to get from your requirements to the model and then, finally, to generate code from that and implement it on some prototyping hardware. We saw that we can do functional verification, both in software and on hardware. And finally, we could even implement this later on production hardware, like an ASIC, using more or less the same code, maybe with some additions.

    And coming to Where can I learn more?-- first of all, we have some free onramps, available on our home page, for learning how to work with MATLAB, Simulink, and so on, but also other products, like Stateflow. So you can find them there.

    For HDL, we also have some material. First of all, we have a training curriculum here, and when you really start from scratch, we would suggest doing the MATLAB and Simulink trainings first, and then, finally, the trainings for HDL code generation with Simulink. We have one with DSP for FPGA, even for integrating this on a development board, and even for radio frequency and SoCs. So that's where you can get more information, and of course, if you are interested, please reach out to us.

    For HDL Coder, we also have, let's say, a guide which helps you avoid making too many mistakes when you generate your code, so that you get efficient code and so on. That is the HDL Coder Evaluation Reference Guide.

    We have an HDL Self-Guided Tutorial which you can go through, also available on GitHub, on File Exchange, and through the MATLAB editions. And finally, we have other resources, like the documentation. For example, on the product pages we have user stories available, and there are a lot of extra product pages for vertical-domain products.

    ADAM TAYLOR: Yeah, a wide range of support, then.

    TOM RICHTER: Exactly. Exactly. I hope you have enjoyed this, are inspired, and maybe even want to use it yourself. Please reach out to us if you have further questions. Now we would like to start our Q&A.