From Algorithms to FPGA / ASIC Implementation with MATLAB and Simulink - MATLAB & Simulink
Video length is 57:05

From Algorithms to FPGA / ASIC Implementation with MATLAB and Simulink

Overview

Learn about generating HDL code from MATLAB® code and Simulink® models for FPGA and ASIC implementation. This session starts with a brief introduction to Model-Based Design and the hardware development workflow. A MathWorks engineer will then demonstrate the step-by-step process with HDL Coder™ to start from initial models, incorporate hardware-specific constructs, and generate Verilog and VHDL code for FPGAs and ASICs.

Highlights

  • Using Simulink and Model-Based Design for hardware development
  • Converting floating point to fixed point for hardware implementation
  • Incorporating MATLAB code into HDL workflows using Simulink
  • Prototyping designs on FPGA and SoC development boards

Published: 19 May 2023

What I'd like to focus on today is giving you an overview of the hardware development workflow using our tools and demonstrating how to take a hardware design from MATLAB behavioral code into Simulink, and then all the way to generating code that allows you to synthesize it and take it downstream to implementation, all right?

So I'd like to start with a little bit of background and motivation. High-level design and verification tools are a growing trend. We see more and more people starting to adopt these kinds of methodologies for their designs.

In fact, 40% of FPGA teams today and 30% of ASIC teams are using some kind of high-level synthesis and verification tool, and MathWorks Model-Based Design is certainly one of them. They're doing that to drive more efficiency in the design flow. Moreover, more than half of all FPGA/ASIC teams are planning to adopt these kinds of tools in the future in different application areas, including signal processing, image processing, as well as AI and machine learning.

And that reflects the fact that production chip design is difficult. It's hard to do. Despite best efforts, 68%-- a majority-- of ASIC and FPGA projects are behind schedule, with over half of the project time being spent on verification.

And despite all of this, a majority of ASIC projects require one or more re-spins, which is really expensive. And the vast majority-- 84% of FPGA projects-- are working on safety-critical designs or have non-trivial bugs escape into production.

So why is it so hard to design FPGAs and ASICs? It really is because designs are getting more complex. They're getting bigger. And that requires a lot of engineers from different disciplines to work together.

A lot of times, in the traditional workflow, that means researchers and system engineers create specifications in documents or code that get passed around from stage to stage. Specifications get passed to engineers that implement them in different languages.

Eventually, a hardware engineer takes over and creates the HDL code, verification engineers need to verify that code, and design engineers take it to hardware. With a traditional workflow, that gets hard to scale as the design gets bigger. And it really results in poor communication across teams-- key decisions being made in silos without other teams knowing.

That leads to a lot of the system-level issues that you only find in late design stages. And the later you find a problem, the harder it is to correct for it. This workflow also makes it hard to adapt to changes in requirements, which happen all the time. And that all leads to the problems we saw in the earlier slide-- things being behind schedule and bugs escaping into the final design.

So the Model-Based Design tools that we have at MathWorks, using MATLAB and Simulink, are really meant to address that by providing a single environment where engineers can collaborate and work together. Your algorithm design, instead of being just a static specification, now becomes what we call an executable specification that allows you to test, catch, and fix errors as early as possible.

So you can focus on finding the best architecture that meets your design goals and generating HDL code-- the VHDL or Verilog code that matches that architecture. And if things change, you can very quickly go in, make the modifications to your model, re-simulate, and regenerate the code. That's the focus of our tools.

In fact, MathWorks makes a lot of code generation tools. We have them for C code, for processors, for GPUs. In the FPGA and ASIC area, the main focus is the HDL Coder tool. This is what you would use to explore architectures, find the architecture that best meets your goals, optimize and analyze your fixed-point design, and create RTL code, as well as a testbench, for your design.

HDL Coder provides some 300-plus blocks that allow you to create designs in different areas. It provides that in Simulink, as well as through additional IP products for applications in DSP, comms, LTE, computer vision, and more.

So you can create these designs in Simulink, and you can do so in MATLAB as well-- either from a pure MATLAB workflow or by using embedded MATLAB, the MATLAB Function block, within a Simulink design, with a lot of different operations that you can use. And that allows you to generate readable, traceable, and rule-compliant VHDL and Verilog code that you can take to the next stage.

So what this methodology allows you to do is adapt to changes very quickly. It lets you create the design at a higher level and generate code that you can rapidly prototype on FPGA or hardware boards, re-target for different vendors, and maybe even move your design from FPGA to ASIC as well.

All right, so that gets us to the meat of this presentation: what does that workflow look like? If you have some MATLAB code that represents your algorithm, how do you then take it all the way to producing HDL code that you can put on your FPGA/ASIC target?

And here I'd like to take a step back and introduce the environment itself. Using HDL Coder, you can generate code either from MATLAB code or from a Simulink model. But the focus of this presentation is Simulink-- the vast majority of our customers use the Simulink environment to generate HDL code-- and I'll explain a little bit why.

MATLAB is the environment you use to work with large data sets-- vectors, frames of data-- very quickly exploring the mathematics, visualizing and plotting things to make sure that your algorithm does what you want it to do.

And that's great. But when it comes to designing for hardware, Simulink provides a lot of benefit beyond what you can easily do in MATLAB. Simulink has an inherent representation of timing: as Simulink takes a time step, it's very similar to your hardware clock driving the rising edge of your design. So the notion of timing is one major reason why Simulink is a really good environment to create hardware designs in.

It also allows you to evaluate different kinds of architectures-- whether you have a low data rate design, something like audio, where you have multiple clock cycles between samples that let you reuse resources; or a really high throughput design, where you have to process multiple samples per clock to achieve gigasample-per-second data rates. You can do that more easily in Simulink than in MATLAB, as well as look at the different data types that are propagating through your design.

For a typical algorithm running in floating point, you don't really need to consider that. But once you're getting ready to go to hardware, you have to convert to fixed point, and being able to very quickly see, as your data goes from stage to stage, what that data type looks like really helps you make the analysis and decisions to create a better fixed-point design.

So if you are new to either of these environments, we do have free Onramp courses that you can take to get yourself up to speed on how to create designs in both MATLAB and Simulink. I've provided some links here.

So this is what the workflow looks like going from MATLAB to Simulink. Most of our customers start with MATLAB. They have some algorithm in MATLAB that is frame-based, vector-based, and that's a perfectly good place to be. It will serve as your golden reference-- that executable spec we talked about-- to help you verify what you're going to build in Simulink.

Sometimes you will also create what I like to call an implementation algorithm. Imagine you have a function like FFT, right? In MATLAB, it takes no time. You just pass the entire frame of data to the FFT, and it gives you the answer.

But a hardware architecture often doesn't work like that, right? You have to stream in samples one at a time. So sometimes you might want to create what we call a more architecture-friendly reference that allows you to do an apples-to-apples comparison when you get into Simulink. So you might do that.

Then you jump into Simulink and realize that architecture in Simulink, convert it from floating point to fixed point, and finally generate HDL code and gradually optimize the HDL code that comes out of it. So that's the design flow that we're going to walk through today.

And the example that we're going to be using is a pulse detector. So we'll be sending out a pulse or waveform over the air. When we receive it, it's kind of corrupted by noise. So then how do we recover that?

We'll be using a correlation filter and finding the peak that comes out of it to detect the waveform that we had sent out. And in fact, this example that I'm going to walk through today exists as a self-guided tutorial version. So after the seminar, if you're interested, download the example. You can actually build it from scratch from the MATLAB reference that we provide, step by step creating the Simulink model that gradually evolves into fixed point and lets you generate code. So that's something you can do. But today, we'll be using a finished version of that example to demonstrate what the workflow looks like.

So in the very first step, we're going to take the MATLAB golden reference into a Simulink representation of it. At the same time, we'll be moving from the vector reference in MATLAB into what we call a streaming, scalar-based implementation in Simulink. And we'll be verifying that the Simulink model architecture matches the golden reference that we started with. So let me go ahead and get into MATLAB.

So over here, this is the MATLAB reference code that we'll be using. It looks like there's a lot of code here, but really a lot of it is test bench-- test stimulus generation, plotting, and so on. The majority of the code that we have to implement in Simulink is actually just three functions, and if you wanted to, you could have put it in a single line. So it's over here.

So we have the waveform, or the pulse, that we're trying to send out, and we create correlation filter coefficients that will allow us to look for that waveform.

The first step is to filter the signal using the filter with these correlation coefficients. Then, at the output of the filter, we find the magnitude of the filter output and then find the biggest value that comes out of it.

So really, just three functions-- filter, abs, and max, right? And if you hover over the MATLAB code-- I already ran it once-- you can see the input to the algorithm in MATLAB is a vector: 5,000 data points coming into the filter, for which we then have to find the magnitude and the peak.
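
As a rough sketch, that frame-based golden reference boils down to something like the following. The pulse, noise level, and variable names here are made up for illustration-- they are not the exact ones in the shipped example:

    % Frame-based golden reference (sketch with made-up stimulus, not the shipped example)
    pulse      = exp(1j*pi/4*(0:63)').^2;                 % an arbitrary 64-sample complex pulse
    corrCoeffs = conj(flipud(pulse));                     % matched-filter (correlation) coefficients
    rxSignal   = 0.1*(randn(5000,1) + 1j*randn(5000,1));  % noisy received frame of 5,000 samples
    loc        = randi(4000);                             % hide the pulse at a random location
    rxSignal(loc:loc+63) = rxSignal(loc:loc+63) + pulse;

    filtOut = filter(corrCoeffs, 1, rxSignal);   % correlate against the known pulse
    magOut  = abs(filtOut);                      % magnitude of the complex filter output
    [peakVal, peakIdx] = max(magOut)             % peak value and location identify the pulse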

So when you're in MATLAB, right? When you have kind of the whole frame of signals or all 5,000, it's really easy to say, give me the magnitudes and find the peak of it. But in hardware, obviously, the data is not going to come in all 5,000 at the same time. And even if they are all available, you wouldn't want to work on like all 5,000 samples at the same time either.

You want a more streaming kind of operation to do the same thing. And that's what we have at the bottom of this MATLAB code here-- a hardware-friendly implementation of that same peak finder. So rather than trying to get the magnitude and the peak of all 5,000 samples in one try, what we do instead is work on a smaller buffer of samples.

As the data comes in, we buffer up just the latest 11 samples that have come through-- so a vector of 11. And then, instead of the magnitude-- that's the second step-- instead of the magnitude of the output of the filter, we just compute the magnitude squared.

Calculating the actual magnitude involves a square root operation, which is quite expensive to do in hardware. And for the purpose of just finding the peak of the filter output, we really do not need to take that square root. So in fact, we're going to just use the magnitude squared, which doesn't involve that operation.

The output of that magnitude squared then goes into the sliding window. As it comes through, we look at those 11 buffered samples and try to find the biggest peak within them by comparing the middle of the buffer to all of its neighbors. And if we find the biggest one, that tells us where to identify the waveform that we had sent out, OK?
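
Here's a minimal MATLAB sketch of that hardware-friendly version. The 11-sample window matches what's described above, but the loop structure and names are my own simplification, not the shipped example:

    % Streaming, hardware-friendly peak finder (simplified sketch)
    % filtOut: complex filter output from the frame-based sketch above
    WINDOW = 11;  mid = 6;                       % 11-sample sliding window and its middle index
    window = zeros(WINDOW, 1);
    bestVal = 0;  bestIdx = 0;
    for n = 1:numel(filtOut)
        magSq  = real(filtOut(n))^2 + imag(filtOut(n))^2;  % magnitude squared, no sqrt needed
        window = [magSq; window(1:end-1)];                 % shift in the newest sample
        % the middle sample is a local peak if it is at least as big as every neighbor
        if window(mid) >= max(window([1:mid-1, mid+1:end])) && window(mid) > bestVal
            bestVal = window(mid);
            bestIdx = n - (mid - 1);                       % undo the window latency
        end
    end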

So when you run this test bench, every time you run it, you'll see that you actually get a different signal-- we've randomized the location, and I think also the waveform that we send-- so every time you send it, it shows up at a different location. But regardless of where it actually is, after the signal passes through the filter, we're able to identify the peak. OK, so that is the algorithm in MATLAB.

So let's take a look at what that looks like in Simulink. Before I do that-- this is the test bench that will run that model in Simulink. In the test bench, we run the reference that we just saw, the pulse detector reference.

It creates that reference filter output and then simulates the model programmatically. You don't have to do it that way necessarily, but this is how this model is set up. So let me go ahead and open up the Simulink model.

So this is the Simulink streaming version of the same pulse detector operations that we just saw in MATLAB. You can see here the same kind of components, right? We have the RF signal that comes in, and we have this filter block that represents the filter function that we saw in MATLAB.

This is the part where we compute the magnitude squared. The signal that comes in is complex, so to calculate it, we split the signal into its real and imaginary portions and then take x squared plus y squared to find the magnitude squared, OK?

Then, for the portion where we buffer up the 11 samples-- this is being done with this little block called Tapped Delay. The data comes in, and I think if I turn on the signal dimensions display, you'll be able to better see that we have an input that comes in, and then we buffer up 11 samples.

After that, we go into this MATLAB Function block to do the actual detection of the biggest sample. We did that on purpose to show you that, even in the Simulink environment, you're able to use MATLAB code.

You don't necessarily have to just drag and drop blocks. For simple operations-- things like control logic, if/else, switch/case statements, stuff like that-- it is sometimes easier to use code than to draw blocks. So you are able to use MATLAB code in a Simulink model by incorporating this MATLAB Function block here.

But the advice that we always give is that the whole point of Simulink is to do the mathematical operations in a way that lets you visualize data types and other things and explore architectures. So we wouldn't recommend just dumping the whole of your original MATLAB code into a MATLAB Function block. That kind of defeats the purpose of coming to Simulink to begin with.

All right, so I do want to also point out one thing over here. You see this RF signal is the same variable that we defined in MATLAB, in the MATLAB workspace. Let me bring it over here-- we have our RF signal.

I'll make this bigger. So we have the RF signal, which is a 5,000-sample vector. But what we're doing here is using this block from the DSP System Toolbox called Signal From Workspace.

What this block does is take the signal you define-- this 5,000-sample vector-- and stream it into the subsequent blocks one sample at a time. We specify how many samples we want to stream per sample time, or clock if you will.

Here, we say we want one sample per sample time. If your algorithm requires you to process multiple samples per clock, you're able to set this number higher. But in this case, it's one sample at a time, so this original 5,000-sample vector will take 5,000 clocks to process through this algorithm.

So by using this block over here, you've already turned the operation from the frame-based reference that you have in MATLAB into a scalar-based implementation that is much more appropriate for an FPGA or ASIC target, OK?
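
Conceptually, that streaming behavior is the same as a simple scalar loop in MATLAB-- one sample enters the datapath per simulation step. This is just a sketch of the idea, not how the Signal From Workspace block is implemented:

    % Conceptual equivalent of streaming the frame one sample per "clock"
    % rxSignal: the 5,000-sample vector from the sketch above
    for n = 1:numel(rxSignal)
        sampleIn = rxSignal(n);   % one scalar sample per step; a higher samples-per-frame
                                  % setting would pass a small vector here instead
        % ... the filter / magnitude-squared / peak logic consumes sampleIn here ...
    end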

So let me go ahead and-- so this is kind of the first step, right? So moving from the frame-based MATLAB reference into a stream-based Simulink model. And I'll show you the testbench that I use to kind of run this model.

So over here, when I run this-- again, every time I run it, it randomizes the signal. What we're doing is we first run the reference to create what we need to compare to, we programmatically run the Simulink model, and then we grab the output that came from Simulink. If you see over here, some of these signals have a little symbol hovering over them.

These are what we call data logging. We log these signals, and when we simulate the model, the logged signals get automatically exported into the MATLAB workspace. I have a little utility function over here to extract them out of that structure, using the name of the signal that I gave it in Simulink, back into a vector format.

Then I have another function over here to compare my filter output reference from my MATLAB code against the output of that same filter in Simulink, right? I compare that and then plot it as well-- this top plot over here. The top two, actually, are the output of the correlation filter-- the real and imaginary components of it-- a sample-by-sample comparison between MATLAB and Simulink.

And you can see the error between them is on the order of 1e-17, so that really is floating-point machine error. The output of the Simulink blocks is virtually identical to our MATLAB reference.
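
A hedged sketch of what that programmatic comparison looks like-- the model name, logged-signal name, and exact comparison are placeholders of mine; the logged signals come back through Simulink's standard logsout Dataset:

    % Run the streaming model and compare against the MATLAB reference (placeholder names)
    refOut = filter(corrCoeffs, 1, rxSignal);                   % golden reference output

    out    = sim('pulse_detector_streaming');                   % placeholder model name
    simOut = out.logsout.getElement('filterOut').Values.Data;   % placeholder logged-signal name
    simOut = simOut(:);                                         % one logged value per simulation step

    % any pipeline latency in the model would need to be aligned before comparing
    err = max(abs(refOut - simOut(1:numel(refOut))));
    fprintf('Max |error| vs. MATLAB reference: %g\n', err);     % expect ~1e-17 for the floating-point model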

I'm plotting that in multiple stages, and again, that's a good idea when you're building a Simulink model-- don't just compare the final output of your model, but plot and compare the intermediate stages as well to help you verify it.

If you do find an issue, you can see at what point it starts showing up, right? So here we are comparing the output of the filter and the output of the magnitude squared, which is that next plot over here, this third plot. Same amount of error-- floating-point machine error.

And then finally, the output of the peak detection-- because it's just a single point, you compare that with simple MATLAB code. You can see that the location is in the same place, and for the magnitude-squared output, the error is again in the same floating-point machine range, all right?

So this is that first Simulink model. I'd also like to point out where you find these blocks that you can use to create a Simulink model. In Simulink, you can open up the Library Browser. I like to look at it outside the model window,

so the blocks look a little bit bigger. The Simulink Library Browser is where you would find all the blocks that you can use to create a Simulink model. And a subset of those, like we talked about before-- a good number of the blocks that we provide let you build things in different application areas.

You can see over here, underneath HDL Coder, if you compare the categories between that and Simulink, you'll see they have all the same categories. What's underneath HDL Coder is a subset of our base Simulink blocks that we generate HDL code for within the Simulink library.

So if you look at, let's say, logic and bit operations, underneath HDL Coder are some of the blocks that you can use for code generation, as well as blocks underneath some of the other toolboxes. For example, we have the DSP HDL Toolbox, which has a lot of things like filters, transforms like the FFT, source blocks like an NCO, and stuff like that, plus things in the comms area as well as other areas.

Now, if you just want to filter the view and only see the blocks that you can use for HDL code generation, you can type a command called hdllib. What that does is filter everything in this Library Browser to only show you the blocks that support HDL code generation.

The first time, it takes a little while to run. You can see here, when I run this, a lot of things disappear from the view, and the blocks that you see now are only the blocks that support HDL code generation, OK?

And then if you want to go back to the original view, you can turn off the filter, and it will restore the library back to the default view, OK? I like to keep it in the original view, because the distinction of which blocks support HDL code generation only matters for the blocks you use in the design.
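
If you prefer the command line, those two steps are roughly the following-- as I understand the hdllib command; check the HDL Coder documentation for your release:

    hdllib          % filter the Library Browser to show only HDL-supported blocks
    hdllib('off')   % restore the Library Browser to the default, unfiltered view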

But at the test bench level, you can really use any blocks you like to build up your test bench, drive the stimulus, and do the verification, because we do not generate code for the test bench blocks-- we only log the data at your design interface, OK? And once you're getting more familiar with the environment-- to the point where you don't really have to come to the Library Browser to search for blocks-- the easiest way to add things in Simulink is just to double-click on the canvas and start typing.

Say I'm looking for an FIR filter. It finds the blocks that match the search string you're looking for, and you click on one to bring the block into your model like that. That's the fastest way to do it once you're more familiar with the environment, all right? So that's the very first step-- bringing your design into Simulink.

The next step is to start to introduce more hardware constructs into your design, right? Things like converting some of the model-level settings to settings that are more appropriate for HDL code generation, using more hardware-efficient blocks, and also adding in things like control signals. A common one is a data valid signal that you can toggle to actually represent data rates, as well as support burst-type behavior that your algorithm might have, OK?

So let's take a look at the second version of that model. Go ahead and close this. So for every model that I have in here, there's a corresponding test bench that I use to simulate it and compare it to our original reference. So this is the second version of that Simulink model.

You can see, if I go into the subsystem over here, it looks a lot like what we had. We've added this data valid signal to represent the data rate, or just indicate when the sample on a particular clock is valid, right? And the previous FIR filter that we had is replaced with another FIR filter. With this one, you can tell that it's created specifically for FPGA and ASIC implementation because it has these data and valid ports, as opposed to the previous one that only has the data port.

OK, so in addition to the control ports, which let you express the data rate, when a data sample is actually valid, and when to ignore an incoming sample, this block also has architectures created specifically to optimize for an FPGA or ASIC target.

We have different architectures, like systolic, as well as systolic with hardware resource sharing-- DSP sharing-- where you can say, well, I actually have at least, let's say, 10 clock cycles between valid samples, so you can start to reuse some of the DSP resources that a fully parallel FIR filter would otherwise take up.

So we replaced that block. But for the most part, other than the data valid signal and the FIR filter, the rest of the blocks are quite similar to what we had before, with a couple of other differences in addition to the replacement inside, right? One thing you see over here is that I've created a hierarchy in the model versus before.

Before, the top level had all the blocks; now I have a subsystem that represents our design under test, the DUT. Basically, everything within the subsystem is what will go into your FPGA, right? And everything around it is just test bench. So, as I talked about before, blocks that are inside your DUT need to be blocks that support HDL code generation, but everything around it can be anything you like, OK?

And also, you notice that this model over here has this red color that we didn't have before. That's because we ran a function called hdlsetup that automatically configures settings for your model that are more appropriate for hardware implementation. Part of what it does is turn on the Simulink sample time colors, and everything here looks red, which indicates the fastest rate in the model-- in this case, that would be the clock rate, OK?
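
For reference, that hdlsetup step is just a one-line command from HDL Coder-- the model name below is a placeholder:

    hdlsetup('pulse_detector_streaming')   % configure model settings recommended for HDL code generation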

So that's the second version of this model. And again, this model is pretty simple-- it doesn't have a ton of microarchitecture and control logic in it. But even so, you can still run it with a very similar test bench to the one you had before. And when you run that, you will see the error of the output of this model compared back to our MATLAB reference.

It's still in that same range, right? 1e-16, 1e-17-- floating-point machine error. And that's one of the main benefits of this workflow, I think: being able to verify your architecture and control logic with an architecture model that is very close to what you'll generate code from, with the exception of quantization. That really allows you to test and debug your design without quantization error obscuring any design problem that you might have, because a lot of times, if your design is off by, say, a sample in a frame, the error that shows up in the comparison looks a lot like quantization error.

And you won't be able to distinguish the two if you already have quantization in there, which is what you have to deal with if you're hand coding, where you do the quantization and the architecture at the same step. You don't get to separate the two. But with this environment, you can.

All right, so this is just a picture of the FIR block that I talked about-- the systolic architecture that the FIR filter in the DSP HDL Toolbox offers. It allows you to map better to the DSP resources that an FPGA has than the FIR filter block that we used in the first version of the model, OK?

So the next step-- like I said, we already have all the architecture in there; all we're really missing is quantization. Before I jump into the quantization of the model, I do want to make sure that we're on the same page in terms of what the fixed-point representation looks like in MATLAB and Simulink.

Let's start with something very simple: what can three bits represent? If you have three bits, you can represent eight numbers. They can be unsigned, 0 to 7, or they can be signed numbers, where you go from 0 up to 3 and wrap around to negative 4 and count back up.

But you don't have to work only with integers. One of the things that MATLAB and Simulink provide is the notion of fractional bits and a binary point, right? Just like the decimal numbers we work with-- when a number gets bigger, it's really hard to visualize and conceptualize what it represents, so having the ability to represent fractions really helps.

When you introduce the concept of fraction length into the same three-bit numbers, the number of values you can represent is the same-- there are still eight. It's just that every time you move the binary point to the left by one place, you reduce the data range by a factor of two, but you also increase the precision that you can represent by a factor of two.

So instead of negative 4 to 3, we now represent negative 2 to 1.5, with the increment being 0.5. And we can keep going-- with a fraction length of 2, we go from negative 1 to 0.75, right, and so on.

I stopped here, but even with a three-bit number you can keep going. Your fraction length can, in fact, be bigger than the word length that you have. It really is just a scaling.
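
A quick way to see that scaling for yourself-- this little loop just mirrors the three-bit examples above, where the real-world value equals the stored integer times 2 to the negative fraction length:

    % All eight values a signed 3-bit word can hold, for a few fraction lengths
    storedInts = (-4:3).';                    % the eight bit patterns, as stored integers
    for fracLen = 0:2
        realWorld = storedInts * 2^(-fracLen) % real-world values: range halves, precision doubles
    end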

So, to distinguish between these different fixed-point representations in MATLAB and Simulink, we use three numbers. The first number is the signedness-- either 0 or 1, essentially true or false, for whether it's a signed or an unsigned number.

The second one is the word length. You can see here we have three bits, so it's three in each of these columns. And the last one is the fraction length-- for the integer case it's 0, and then 1, 2, and so on.

So whether you're in MATLAB or Simulink, it really is just those three numbers that determine the fixed-point representation you have. In MATLAB, you would use something called fi, which lets you define a value first and then the fixed-point representation: the signedness, the word length, and the fraction length.

In Simulink, the function that you use is a little bit different: you specify just the data type, and you don't specify the value with it-- the value is on the main tab over here. But it's the same three numbers-- 1, 18, 16 in this case.

There's a lot more to this-- there's a link here that talks about how to represent fixed-point numbers and integers in MATLAB and Simulink, so you can take a look if you have time. But I do want to point out that fi is more than just a function. It actually creates an object.

So x is now an object that has additional functions that you can actually run. And it's useful to see that, well, at the end of the day, x is still just an 18-bit integer that we are storing. So what is that underlying integer? What is the range of this number that we have?

So the precision, the epsilon, with a fraction length of 16 is really 2 to the negative 16. This is the smallest increment that we can represent with this data type. And when you multiply the stored integer by the precision, that gets you back to the real-world value that you have.

So this over here, 0.4473, is actually not rounding error. The reason it's not 0.44725, which is much closer to what we specified initially, is just that, by default, MATLAB shows four digits.

You can, in fact, change the display format to show more digits, and you will see something a lot more precise than 0.4473. And the range as well-- you can see the minimum and maximum values that you can represent with this data type.

It's always biased towards the negative side because we have to represent zero. So if we can represent negative 2 exactly with a particular data type, the positive side would be something like 1.9999..., which is very close to 2 but not exactly 2.
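
In MATLAB, that inspection looks roughly like this with the fi object from Fixed-Point Designer-- the 0.44725 value is just the example from the slide, and the exact digits you see depend on your display format:

    x = fi(0.44725, 1, 18, 16)        % signed, 18-bit word length, 16-bit fraction length
    storedInteger(x)                  % the underlying 18-bit integer that is actually stored
    eps(x)                            % precision: the smallest increment, 2^-16
    double(storedInteger(x)) * 2^-16  % stored integer times the scaling = real-world value
    range(x)                          % representable range, about -2 to 2 - 2^-16
    format long
    double(x)                         % show more digits than the default 4-digit display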

All right, so now that we're hopefully on the same page about the fixed-point representation in MATLAB and Simulink-- once you start operating on fixed-point numbers, this is where the interesting things happen and where you need to do more work: managing the bit growth.

When you add two fixed-point numbers, you grow a bit, right? Adding two 4-bit numbers results in a 5-bit number. Multiplying two 4-bit numbers, you end up with basically the sum of the two word lengths that you're multiplying-- in this case, 4 plus 4, which is 8.

When you operate in full precision-- growing as many bits as you need to fully represent the output-- you don't lose any information with the growth. But at the end of the day, you have to start trimming it back. You cannot keep increasing the word length as you go through your datapath.

And that's where a lot of the fixed-point conversion work comes in, right? In Simulink, full precision-- the fact that multiplying two 18-bit numbers gives you 36 bits-- is done automatically for you by the block. That's the default setting.

It will just use all the bits that are required to represent the number without overflowing, right? So 18 by 18 gets you 36, and 36 plus 36 gets you 37, right? But we have to trim back at some point.
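
You can see that same full-precision growth in MATLAB with fi objects under the default fimath settings-- a small sketch with arbitrary example values:

    a = fi(0.7071, 1, 18, 16);           % signed 18-bit word, 16 fractional bits
    b = fi(0.25,   1, 18, 16);

    p = a * b;                           % full-precision product under the default fimath
    [p.WordLength, p.FractionLength]     % 36-bit word, 32-bit fraction

    s = p + p;                           % full-precision sum grows one carry bit
    s.WordLength                         % 37 bits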

Over here is a screenshot of that magnitude-squared block in the middle of our design. The output of the filter is full precision, 40 bits. So we have to trim it back, and we decided to trim it back to 18 over here. Why?

That's because for a lot of the DSP blocks in FPGA targets, at least one leg of the input has 18 bits as a kind of boundary. It might be 18 by 18, or 18 by 25, or 18 by 27; some are 27 by 27, and so on.

But 18 is a safe number that definitely fits in most of the DSP blocks that you want to target, regardless of vendor. So we pick 18. And the way that we do it is in a way that doesn't require us to lose any integer range.

The integer range of this number is the difference between the word length over here, 40, and the fraction length, 37. So we have three bits of integer range. When I trim the input down from 40 bits to 18, I'm just throwing bits away at the lower end-- I don't change the integer range.

That guarantees that I won't overflow as I go from the 40-bit data type to 18 bits over here, but it means I lose precision. Then I do the same thing: I let the bits grow through the multiply and add in order to leverage the resources of the FPGA DSP block, but at the output, which is 37 bits, I again trim back to 18, OK?
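
In fi terms, that trimming step is a cast that keeps the same three integer bits-- so it cannot overflow and only gives up low-order fraction bits. A sketch with an arbitrary value:

    full = fi(0.3, 1, 40, 37);    % full-precision filter output type: 3 integer bits, 37 fractional bits
    trim = fi(full, 1, 18, 15)    % still 3 integer bits, so no overflow is possible; only precision is lost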

So then after that-- so this is kind of a good guess, right? Good starting point. But are we doing it efficiently? So we know that by choosing these data types, we won't overflow. But in fact, maybe it's not the most optimal data type that you can pick based on the numbers that are actually going through those blocks.

And in Simulink, there's actually a pretty good way to find out if that is the case, using what we call the Fixed-Point Tool, OK? So let me go ahead and show you step 3 of the model, which is the fixed-point conversion.

Hey, Curie, while you're doing that, I just want to jump in here because there were a number of questions related to this. Here's somebody-- Jameel-- asking, "Hey, we're using double versus single floating point versus fixed point. I just want to find out whether arbitrary bit widths and a sign/exponent/fraction format are supported with fixed point." But also-- yeah, I guess he's asking about floating point, too.

Yeah, yeah, yeah, good question. So I actually have-- let me just get this running. Give me a second. So good question, right?

So the majority of designs that we see people do are fixed point-- maybe 99% convert to fixed point in order to implement on an FPGA or ASIC. But HDL Coder does support generating code for floating-point designs. And that includes double, single, and half precision, which is a somewhat new data type from the last few years in MATLAB and Simulink. So we do have that.

Thank you.

So this is the fixed-point model. You can see over here it looks a lot like what we saw before in the previous version, except that everything is now in fixed point. Like I said, a 16-bit input goes into the filter block, and the full-precision output is 40 bits.

These are the conversions from the screenshot on the earlier slide, where we convert the data type down to a smaller word length before we carry on to the next step, right? And you can see over here this test bench is just like the one we had before: we run the model and compare the output of the Simulink blocks to the MATLAB reference.

And you can see the error right now is definitely bigger than before. It's still a very small error-- a perfectly good quantization error-- but nonetheless, it's much bigger than what we saw before with the floating-point machine error, right?

And we know that this is, for sure, just quantization and not any other design issue that we might have in the Simulink model, because we built the architecture entirely in floating point before we converted to fixed point.

So let's take a look at the Fixed-Point Tool. I'm going to bring it up from the Apps menu over here. We've converted to fixed point, and we want to see how well we've done by logging the data that goes through the model and visualizing, per block, the maximum and minimum values that have gone through it, comparing them to the fixed-point data types we've chosen, and seeing if we've done well.

So let's run that simulation now. What it does is run the model, capture all of the data that goes through the blocks, and then present it to you in a table and histograms. It shows you all the blocks in the model, but I'm browsing through the hierarchy and just looking at the blocks under the compute power subsystem, which is the magnitude squared, right?

If I double-click on any of these entries, it brings me to the corresponding block in the model-- this is the block that entry is about. You can also go the other way: right-click on the block and go to the Fixed-Point Tool results, and that will bring you from the model back to the results in the tool.

Let's take a look at the input over here, right? We have a Data Type Conversion block that converts our 40-bit input into 18, 15-- 1, 18, 15, right? With this data type, we can represent a range of negative 4 to almost 4, with a precision of 2 to the negative 15.

But in the actual simulation, the biggest and smallest data that went through the block is quite a bit smaller than what we specified. You can see here-- to visualize it better, the gray area in this histogram bar shows the upper bits that never toggle, because the actual data is quite a bit smaller than the data type we specified. That might still be OK, because your simulation might not accurately reflect the range of the actual data that will go through this block on hardware.

But let's take a look at the output, where we convert 37, 30 to 18, 11, as you see here. That data type can represent negative 64 to almost 64. But the data in the simulation is, I think, actually even smaller than what we had previously.

It's just between 0 and 0.055, right? You can see over here the gray bar is a lot bigger-- a lot more bits are not toggling. So these are wasted resources; they don't do anything in hardware. If you don't require that range, maybe you can just trim those bits away and use a smaller data type, or allocate some of those 18 bits to the lower end so you have more precise numbers and less quantization error.

So this is what the Fixed-Point Tool helps you do. This is the kind of analysis that you would have to do manually anyway-- probably very tediously-- but it helps you do the analysis and visualization very quickly with the information that it has, OK? The tool has been around for a long time; most people just don't realize it exists. It can help you make better fixed-point design choices.

So finally, let's get into the last step, which is generating code. At this point, we've put the necessary architecture for hardware into the design and we've quantized it to fixed point, so it's ready for code generation. And, you know, I've already run through generating the code and running it through synthesis using what we call the Workflow Advisor.

With the design over here, you open up the HDL Coder toolstrip at the top, which is underneath Apps, HDL Coder. That will open up, and you can click on what we call the Workflow Advisor. That brings you into a guided workflow to generate code and, if you want to synthesize, connect to your downstream tool-- Vivado or Quartus, which I saw someone ask about-- and it helps you actually synthesize the design by running those tools behind the scenes in batch mode.
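
If you'd rather script it than click through the Workflow Advisor, HDL Coder also has command-line code generation-- a rough sketch with placeholder model and subsystem names:

    % Command-line alternative to the Workflow Advisor (placeholder names)
    makehdl('pulse_detector_fixpt/Pulse_Detector_DUT', ...
            'TargetLanguage', 'Verilog', ...                 % or 'VHDL'
            'Traceability',   'on');                         % generate the code/model traceability report
    makehdltb('pulse_detector_fixpt/Pulse_Detector_DUT');    % generate an HDL test bench as well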

I've already done that before the presentation to save a couple of minutes. We've set the target to Vivado, targeting, I think, a 7000-series device, and I've generated all the HDL code. In fact, I'll just quickly show you.

I also turned on what we call a traceability report. That allows you to go very quickly back and forth between the model and the code. You can see, as I navigate through the model here, the code on the side-- with the traceability report-- points you to the code for the particular block.

So as I navigate through, let's say, the magnitude squared, this is the code that we generate for the multiply-- just this over here-- and this next operation is this code, right? And these are the tapped delay blocks, stuff like that. That's very helpful when you have to review what kind of code we generated, especially for a high-integrity workflow where you have to do a review of the generated code.

This also allows you to take it all the way down to synthesis. And if you have chosen one of our supported evaluation boards, it will actually drop the design into a larger reference design-- kind of like a donut hole-- and then give you the bitstream to program the board with.

But here I only ran it through synthesis, and this is the post-synthesis resource and timing estimate, right? The DSP usage, 194, is actually high, but that's as expected, because the FIR filter we use is fully parallel, has 64 taps, and is complex.

It ends up being 3 multipliers per tap, so 64 times 3 is 192, plus the two that we use to calculate the magnitude squared, which gives 194. So it's exactly as we expected, and it comes out with pretty good timing. So that's--

Hey, Curie. I know you're about to cover this stuff, but one thing coming in that I think a couple of people brought up is how what you showed here would differ if I were using Intel or Microchip, or Lattice for that matter.

Yeah, great question. So the integration is very similar, right? Over here, I picked the generic workflow, where you can choose a synthesis tool. Right now, I only show one tool because I only set the path to one.

But we support integration with Xilinx Vivado and the Intel FPGA tools, as well as the Microchip tools. So you can pick the tool. And if you choose not the generic workflow but what we call the IP core generation workflow, you can pick from a number of supported evaluation platforms to target as well. We'll address a lot more of that-- the targeting portion-- in the next seminar in two weeks.

So that's just a quick summary slide over here. We already mentioned that we do support floating-point code generation, and that is really useful if you have a portion of your algorithm that can benefit from floating point-- you know, where you have to represent a really high dynamic range that's difficult to represent in fixed point, for example, right?

And this is just a quick summary of what we showed using the Workflow Advisor to help you get to the next step-- generating code, taking the generated code into your downstream tool for additional steps, and reviewing the results and tracing between the code and the model.

This addresses the earlier question as well: what's next, right? It's more of a preview of what we will cover in the next seminar in two weeks. After you get the HDL code, what do you do for both simulation and targeting? How do you take the design to hardware?

We actually support workflows where, for a number of evaluation platforms, you can take the code and put it in a reference design. You can also target the processor if you have a device with one, like a Zynq. You put the design on the hardware and, at the same time, connect it back to MATLAB to visualize the data.

And, if you want, you can tune parameters to really test out the functionality of the algorithm in hardware, in real time, leveraging MATLAB. So that's covered in the next seminar. And that's really it. So we can maybe reserve a few minutes to answer some questions.

But hopefully, this was a good overview of what Model-Based Design is about for FPGA and ASIC applications using our tools. And here are just a handful of selected testimonials from our customers on how the tools help them achieve better results with less time and less effort.

And for additional resources-- we do have a web page; if you click on it over here-- I'll just do that real quick-- there are lots of resources linked: getting-started resources, videos, and links to training, which we've also linked over here.

At MathWorks, we provide a lot of additional services, like guided evaluations for our HDL tools-- that's actually the main job I do day-to-day, helping customers prove out the tools and adopt them for their workflow-- as well as training courses and consulting services.

And this is just part 1 of a series. We have two more on the schedule, and there will be more announced later this year, starting in July.
