Building a Cloud-Based Digital Twin for an EV Battery Pack
Creating, validating, and correlating the model of a physical asset is important to building a digital twin, but modeling is only one aspect of the overall process of developing and deploying digital twins. In this presentation, we showcase a project that spans developing the model of an EV battery, deploying it to the cloud, connecting it to the data infrastructure, and predicting battery state of health based on data from a real-world electric vehicle fleet. Join us to learn about key considerations when planning your digital twin project.
Published: 21 May 2023
So digital twin is a major trend, and it's a new way of thinking and working. Where we used to sell physical things, we now see the sale of uptime or support. Where we used to see revenue from one-off sales, we're now seeing a shift towards subscription-based pricing models.
And finally, where we once saw products that had a fixed functionality, they're now evolving to functionality that can be expanded or added onto. Digital twin adoption is a journey, and it's all about the development of digital products and services.
My name is Will Wilson, and I'm an application engineer based in the Novi, Michigan office. And I work with our automotive and off-highway customers on all sorts of MATLAB and MATLAB-based analytics. And I'm very excited to share with you some ideas and some findings about a project that my colleagues and I have been working on with regard to an EV battery pack digital twin.
I'd like to begin with the end in mind and show you what the end result of one of these workflows might look like. This is a MATLAB-based app that I've deployed as a web app-- so we're sharing this app as a web page-- to talk about the current condition of a real system in the field.
So the first page talks about high-level condition: good, bad, some information about throughput, maybe state of health. And then we can dive deeper into understanding our system, maybe looking at some of the time series-- current versus time, voltage versus time-- over a set period.
Maybe dive further and look at things like time at level because we know that operating at state of charge at the extremes could perhaps lead to damage. From there, we might want to understand more about usage, maybe combine signals like voltage and current and state of charge and see how they interplay and interact over time. Maybe just some operational metrics. How much time do we spend charging or discharging? Or maybe we have some error state that we need to dive deeper into.
We can go further into labeling our data and understanding how that data is labeled, which can be really important when we're developing models. We can deep-dive things like state of charge during different regions of interest and understand that distribution. And finally, generate some sort of model to estimate, in this case, state of health and see how that changes over time and temperature.
So this is one example of what I'm calling a digital twin to show us how our system is behaving in the field. Now to stay on track, I've broken our talk into three parts. We're going to have some background information, we'll try to spend most of our time talking about our project, and then I've got some ideas for next steps for you.
So whenever someone says, hey, Will, I want to talk about digital twins, the first thing I say is, what do you mean by that? And so let's have a definition. Let's use this for our conversation today. So digital twin is a representation of an asset. Typically a physical thing, could be a process. It's used to support decision-making, and that's the most important thing, and it's connected to data. So we'll use that as our jumping-off point.
Now what do people actually use digital twins for? All sorts of interesting applications. There's no one right way to do it. But some of the more common ones we see are virtual sensing, predictive maintenance-type algorithms, and condition monitoring. And we see this throughout a lot of different industries. So that's how people are using digital twins.
Now because I do a lot of work in analytics, oftentimes I talk to people about workflows. Where are we going to start, where are we going to end? And this workflow, although it's four main steps, it's very useful in guiding our discussion. We start with data access. Where does our data live? How are we going to touch it? We talk about pre-processing that data. Missing values, outliers, all the stuff that goes into that.
Typically, people want to jump right into modeling-- "I just want to talk about the model"-- which is great, but the two previous steps are non-trivial and very important. And then the last step in the analytics workflow is where we bring value to our work by sharing it with the outside world.
Today, I showed you an example of a dashboard you could build, but this could also be software you deliver to a downstream team, or reports you make. So keep this framework in mind because you'll see this as a major theme throughout the talk.
Now when we talk about modeling, it's important to recognize that there is a spectrum of modeling techniques, from physics-based models all the way to AI models. And oftentimes, we see a mixture of these. There's no one right way to do this. It's a function of what you have and what you know. What you have in terms of data-- do you have any data? And what do you know how to do? Based on that, you can get to a strategy and talk about modeling techniques.
One of the key things that we had to do in our project was figure out how to take real-world log data and label it. And these labels are basically constant power charging in orange, where it says Constant Power Charging, the quick charge events in yellow, and then purple is just normal usage.
And it was really important to be able to identify these so that we could apply our calculations and estimations during those time windows of interest. Now the picture is nice, and I hope it proves the point in context, but where I really go with this is a tabular representation of the picture I just showed you. We heavily use MATLAB tables and timetables to keep track of the information that we pull out of these signals. It's labeled information.
And this also gives us a jumping-off point if we choose to go down the AI path, because now we've got a mechanism to label data, and that opens up the opportunity to do so.
So let's talk more about our actual project now that we've got some baseline understanding. For our project, we were given about a year and a half's worth of real log data for a fully electric material-handling machine.
So a real machine in the world. Take that data, design an engineering pipeline to go from the raw logger files into something that's analytics-friendly, develop some sort of model that could estimate state of health in this example, and then deploy that out into the world-- in our case, a dashboard. One of the salient details is that the anode material for this particular machine's battery is lithium titanate, which is a pretty robust anode.
So to begin with, again keeping in mind our analytics workflow, we had these MF4 files-- raw logger files-- in an S3 bucket. That's how the data was presented to us. And we used MATLAB to grab a couple of files, bring them down from S3 to our local machine, open them up, take a look at them, CAN-decode them, and get a feel for what was there. Just scope the problem out-- kick the tires, if you will.
And then we built out a full-blown data engineering pipeline that went from the raw MF4 files of raw CAN data, using MATLAB Vehicle Network Toolbox and some MATLAB scripting, to well-behaved, well-formatted tabular data in MATLAB timetables, which we ultimately wrote to Parquet files.
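For a single file, that conversion step might look something like this minimal sketch, assuming a Vector-style MF4 log and a DBC file describing the CAN messages; the file names, DBC name, and message name are placeholders, not details from the actual project:

```matlab
m    = mdf("log_0001.mf4");               % open one raw logger file
raw  = read(m);                           % channel-group data (a cell array when
                                          % the file has multiple channel groups)
db   = canDatabase("pack.dbc");           % CAN message/signal definitions
msgs = canMessageTimetable(raw{1}, db);   % decode raw frames into CAN messages
sigs = canSignalTimetable(msgs);          % one signal timetable per message name
parquetwrite("log_0001.parquet", ...
    timetable2table(sigs.BatteryStatus)); % "BatteryStatus" is an assumed name
```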
Parquet is an Apache file format that's really good for tabular representation, and you can use Parquet files in almost any software that you have. So at a high level, this is how we prepared the data for the modeling work we were about to do. I'm showing this because, in the interest of time, we can't go through all the details; I'm going to focus on the modeling side of things in the next piece.
So let's jump over to MATLAB and actually see how we did some of this. So I've got my MATLAB script here. And we're going to begin by considering where the data lives. So for the purposes of this demo, I'm going to use data on my hard drive. But I told you before, the data actually lived in the cloud.
And so when you're working with data in the cloud with MATLAB, you have to do two simple things. Instead of C:\MyData or S:\ with some path to your network file share, you specify where the data lives in the cloud. In this case, I was dealing with an Amazon S3 bucket, so you would define a URI-- an s3:// path to my bucket. And then you have to provide some sort of credentials to get into that. Do you have permission to actually touch that data?
And one of the newest things we have is a function called loadenv, which takes a plain-text .env credentials file and lets you keep your credentials out of your code. You load the file when you need it, and the credentials are never actually part of your code.
And if you're wondering what these credentials actually look like, for AWS, they contain things like access keys and tokens. If you were working with Azure or GCP, your information is going to be a little bit different, but it's the same idea. And we support all three cloud platforms.
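Putting those two pieces together, accessing the same data on S3 might look like this sketch; the bucket path is a placeholder, and the .env file is assumed to hold standard AWS environment variables:

```matlab
% Keep AWS credentials (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, etc.) in a
% .env file rather than in the script itself.
loadenv(".env");                                            % R2023a and later
pds = parquetDatastore("s3://my-battery-logs/processed/");  % placeholder bucket
```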
So I talked about having Parquet files as the place to start. All of my data was preprocessed into Parquet files, and I'm going to use a datastore to represent that large collection of files. A datastore is an object that says, hey, tell me where your data lives and I'll help you manage it: I'll create the list of files for you and give you an infrastructure to iterate through them.
So in this case, because I'm using Parquet, I'm using a Parquet datastore. There are other datastores for tabular text, for images, et cetera. And what we get back when we call read on the datastore is a table, or a timetable in this case. Each row of my timetable here is an instant in time, and each column is one of the different sensor values.
And one of the important things you typically want to do for any given analysis is set some sort of selected channels. In my case, I had 120 logged channels, and you typically only need half a dozen or a dozen for any given analysis. So it shrinks your tables down from really wide ones to narrower ones. That way, you're dealing with less data and spending less time waiting on it.
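Concretely, a sketch of that setup on local files might look like this, where the folder path and channel names are assumptions for illustration:

```matlab
pds = parquetDatastore("data/processed/");    % folder of preprocessed Parquet files
pds.SelectedVariableNames = ["Time","PackVoltage","PackCurrent","SOC"];
t  = read(pds);                               % one file's worth of rows, as a table
tt = table2timetable(t, "RowTimes", "Time");  % time-indexed for the later steps
```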
So I've got data in my table, and then one of the first things I do, of course, is create a picture. I like to try to visualize as much as I can interactively in MATLAB so that I can understand relationships. And one of the main points here that I want to show you is that when you're dealing with time series data, especially data from data loggers, there can be cases where there's big gaps in time.
Now this may just be that the machine was turned off-- no big deal-- but you need to think about how you're going to deal with that. If you're doing any sort of interpolation work or you're looking for regions of interest, pay attention to this, because the data is not necessarily going to be contiguously sampled. This is a data quality, data cleanliness type of discussion.
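A quick way to sketch that gap check on the timetable from earlier, with the threshold as an assumption:

```matlab
dt   = diff(tt.Time);            % spacing between consecutive samples
gaps = find(dt > minutes(10));   % assumed threshold for "the logger was off"
fprintf("Found %d gaps longer than 10 minutes\n", numel(gaps))
```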
Once we have some understanding of our data itself, we're going to think about our model. And so to estimate state of health, I'm using a very simple model where I combine resistance and capacity with a couple of coefficients. So I'm going to just try to create the simplest model possible knowing that down the line, I could always drop in something more complicated. I could use AI, but if I build the process with a simple model, I can explain it to others, I can test frequently and iterate a lot through this.
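As a hypothetical shape of that simple model-- the talk doesn't give the actual coefficients-- it might be a weighted blend of normalized capacity and normalized resistance:

```matlab
function soh = estimateSOH(C_est, R_est, C_nom, R_nom)
% Hypothetical form of a simple SOH model: a weighted blend of normalized
% capacity and normalized resistance. The weights are placeholders, not
% values from the talk.
w   = [0.7 0.3];                                % assumed weights
soh = w(1)*(C_est/C_nom) + w(2)*(R_nom/R_est);
end
```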
So to do this, I have to be able to estimate internal resistance, and that turns out to be delta V over delta I here. And then for capacity estimation, we're going to use those quick charge events and integrate current, relating it to the change in state of charge. So that's the gist of the equations. I like to try to turn them into pictures because I find that easier to follow.
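In code form, the two estimators might look like this minimal sketch, where seg is assumed to be a timetable covering one quick charge event and the channel names (PackVoltage, PackCurrent, SOC) are assumptions:

```matlab
function [R_int, C_est] = estimateFromQuickCharge(seg, iBefore, iAfter)
% Sketch of the two estimators described above. seg is a timetable covering
% one quick charge event; iBefore/iAfter are sample indices straddling the
% current step. Channel names are assumptions for illustration.
dV    = seg.PackVoltage(iAfter) - seg.PackVoltage(iBefore);
dI    = seg.PackCurrent(iAfter) - seg.PackCurrent(iBefore);
R_int = dV / dI;                                   % internal resistance = dV/dI

t_s   = seconds(seg.Time - seg.Time(1));           % elapsed seconds in the event
Ah    = trapz(t_s, seg.PackCurrent) / 3600;        % integrated current, amp-hours
C_est = Ah / ((seg.SOC(end) - seg.SOC(1)) / 100);  % capacity implied by SOC swing
end
```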
And so just to set the stage-- I showed this in the app, but here's a little bit of a zoomed-in version. This labeling is not a feature or function in a given toolbox. This was me sitting down with some colleagues and building some MATLAB functions to do this detection and return these results.
So within the orange segments, that's where we're going to apply our calculations for all of our year-and-a-half's worth of data. I'm showing you on one, but we're going to apply this to everything.
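Since that detection code isn't a toolbox feature, here's a minimal sketch of the idea, assuming a timetable tt with a PackCurrent channel (name assumed) and a sign convention where positive current means charging:

```matlab
function events = detectQuickCharge(tt, iThresh)
% Minimal sketch of an event-labeling function: find contiguous spans where
% charge current exceeds a threshold and return their start/end times.
isQC     = tt.PackCurrent > iThresh;       % assumes positive current = charging
d        = diff([false; isQC; false]);     % +1 at rising edges, -1 at falling
startIdx = find(d == 1);
endIdx   = find(d == -1) - 1;
events   = table(tt.Time(startIdx), tt.Time(endIdx), ...
                 'VariableNames', ["StartTime", "EndTime"]);
end
```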
And then visually, if we go down here and look at this, this is 24 hours of data-- on the top I've got current, in the middle I've got voltage, on the bottom is state of charge, and the gray vertical bar is the quick charge event, if you will. I'm able to detect a bit of a step change in current and voltage, and that's how I compute delta V over delta I.
And the gray box is also the region where I integrate current, and I look at the delta SOC on the bottom there. This is just a quick representation of the integrated current: I fit a simple line to it, and the slope of that line becomes my capacity estimate for that little segment in time.
And I take this back to the table I showed you earlier with all the different start and end times-- there are something like 300 different quick charge events in that year and a half. Now I start to add columns onto this table that represent my estimations: internal resistance, state of charge at the beginning of the event, and the estimated capacity.
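That bookkeeping loop might look like this sketch, reusing the hypothetical events table and estimator from the sketches above; the step indices passed in are placeholders:

```matlab
for k = 1:height(events)
    seg = tt(timerange(events.StartTime(k), events.EndTime(k)), :);
    [r, c] = estimateFromQuickCharge(seg, 1, 2);  % placeholder step indices
    events.R_int(k)     = r;                      % internal resistance
    events.C_est(k)     = c;                      % estimated capacity
    events.SOC_start(k) = seg.SOC(1);             % SOC at the start of the event
end
```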
So we're starting to see where MATLAB tables and timetables become extremely useful for keeping track of all this. Being able to calculate it sometimes isn't enough. You need to calculate it and keep track of it.
So we do all that, and then we want to actually look at state of charge, because we know all of our calculations are heavily dependent on it. So we can take the state of charge at the beginning of each quick charge event and look at the distribution. And what we find is that for 90% of our quick charge events, state of charge is between 0 and 40%.
And this is, again, just helping us scope out the problem and figure out where we want to try to apply our estimations. So we're only going to use state of charge between 0 and 40% and ignore everything else for now as a first pass through our data. And we're still in the exploratory phase of our data journey.
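In code, that first pass is a short sketch (the column name carries over from the earlier bookkeeping example):

```matlab
histogram(events.SOC_start)                        % check the distribution first
keep   = events.SOC_start >= 0 & events.SOC_start <= 40;
events = events(keep, :);                          % first-pass subset of events
```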
And of course, again, going back to tables: because I want to keep track of all this, if I had to explain it to someone or walk into a design review, I've got it as a MATLAB table. I could dump it to Excel or write it out in a report, and it's traceability and rationale for what I did.
So we can take all of our estimations and plot those versus time, and a little description here. So on the top is our estimated capacity versus time, and the bottom is our estimated internal resistance versus time. The purple dots represent ambient temperature.
And this is important because in the first incarnation of this plot, temperature wasn't there. And we saw this sinusoidal effect and said, what is that? It turns out there is actually a temperature effect going on that we're observing in the logged data.
And it does start to make a little bit of sense. If we look in the bottom plot, anywhere where the temperature is low like in January, the resistance is high. And then in August and July, the temperature is high and the resistance is low.
So from a physics perspective, we know we're on the right track, but we also know that we have this temperature effect that we have to wrestle with somehow. And the colors of the other dots represent the SOC bins just because we wanted to look at that in another view.
The next thing we did is map all the temperatures. So if I look over here, this is a little bit of an advertisement that we have a new data type in MATLAB: in R2022b, we added dictionary as a brand-new type. So I got the weather data from NOAA and mapped the dates to the temperatures. And then in the table where I collect all this data, I just use the dictionary to add the values.
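A sketch of that mapping, assuming a NOAA weather table whose Date and MeanTempC variable names are assumptions:

```matlab
% Map calendar days to ambient temperature with the R2022b dictionary type.
tempByDay = dictionary(weather.Date, weather.MeanTempC);
events.AmbientTemp = tempByDay(dateshift(events.StartTime, "start", "day"));
```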
And we can do some additional visualization if we want to look at internal resistance versus 1 over temperature. This just tells us that the temperature behavior is consistent across the SOC bins we're looking at. The next thing we tried was to use the Arrhenius equation to take resistance to some sort of reference temperature and see if we could remove that temperature effect from the data itself.
So we applied that-- I chose 23 C simply as a reference point-- and, again, we continue to add all of these calculations to our table as we go along. And then ultimately, we visualize it to see what's happening. In this case, I'm looking at the temperature-corrected DC internal resistance versus month, looking for any sort of statistical significance, trends, or behaviors I can tease out.
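That Arrhenius-style correction to a 23 C reference might look like this sketch, assuming the activation-energy slope is estimated from the data itself:

```matlab
invT = 1 ./ (events.AmbientTemp + 273.15);     % 1/T in 1/K
b    = polyfit(invT, log(events.R_int), 1);    % slope of ln(R) vs 1/T ~ Ea/kB
Tref = 23 + 273.15;                            % chosen reference temperature, K
events.R_corr = events.R_int .* exp(b(1) * (1/Tref - invT));
```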
A lot of exploration here. Wrapping this all up and going back to the model I talked about at the beginning-- our simple state of health estimation-- when you take all of that together and apply it in the end-result model, you get this: estimated state of health versus time. So again, about a year and a half's worth of data.
We still have a significant temperature impact. Yes, I know my model predicts greater than 100%, so I totally get that. We think there's something going on with the way that state of charge is estimated. So as a next step, that's what we're going to look at and try to figure out, but at least we have a workflow in place that gets us from raw data to a potential model, and we can now continue to iterate and refine.
So from a "what did we see and do here" perspective, let's talk about a few things. We talked about data access: I showed you local data and talked about data in the cloud, because it's really important to be able to do that. Number two, labeling: you have to be able to label stuff.
Now I'm using it just to perform estimates in certain time windows, but labeling opens us up to a wide range of AI opportunities and possibilities. We talked about setting expectations with state of charge. We looked at bringing in additional outside data sources-- in this case, temperature data from NOAA-- so you can further enrich your story. We did some temperature correction. And then we used visualizations throughout to build and increase our understanding.
All right, I'm going to hop back to slides. And I want to just reiterate that the advertising said this was a cloud-based digital twin, and cloud was used heavily in this project. The data is real data from a real machine out in the world, and it was presented to us in S3. That was not our choice. And the question was, hey, can you actually do anything with this? We said, sure, absolutely we can.
Number two, the dashboard I showed you was hosted locally just for the demo, but it can be hosted on AWS, Google, and Azure, and we have public-facing reference architectures for that on GitHub if that's what you want to do.
The data engineering pipeline was 100% cloud-based. Once I figured out how to do it on the desktop, I ran it all on a big MATLAB box on AWS with 96 cores, and also on MATLAB Parallel Server on AWS. So the bulk of the work ran in the cloud-- and there were hundreds of thousands of these MF4 files, so it wasn't a trivial amount. And then ultimately, you can connect to other software applications or cloud services, like databases or other visualization tools.
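The scale-out pattern is roughly the standard datastore-partition one; here's a sketch assuming Parallel Computing Toolbox and the pds datastore from earlier, not the exact production pipeline:

```matlab
n = numpartitions(pds, gcp);             % split the file list across the pool
parfor k = 1:n
    sub = partition(pds, n, k);          % this worker's share of the files
    while hasdata(sub)
        t = read(sub);
        % ...convert, label, and estimate on this chunk, then write results out
    end
end
```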
All right, so some next steps and some ideas for you. On May 4, we're actually going to have a fantastic webinar that goes even deeper into this topic. It gets even deeper into the DevOps side of things: source code management, and what to do if, on the front end, you have a streaming pipeline-- a Kafka stream-- instead of batch-based processing like we did. It'll also touch on drift detection and state of health estimation.
So if you have time, on May the 4th, this is probably worth tuning into and checking out, because it's going to go into a lot more detail on these things.
I'd be remiss if I didn't talk about Simscape Battery, which is the newest addition to our Simscape family of tools. It's designed to help you build parameterized models of battery packs. And if I had something other than just raw data files-- if I had additional knowledge about the system-- this is absolutely where I would start to help me build up that knowledge, so that I could build, model, and test.
And if you want to know more about this, two of our colleagues are in the demo area and can talk to you more about it.
I want to encourage you to invest in yourself and focus on training. It turns out that for everything I talked about today, there are at least five classes offered in the near term that would help you continue building your skills in this digital twin area.
And lastly, I want to say that a lot of people are hesitant or a little bit intimidated by, how do I get my stuff out into the world? How do I handshake with other things? We have lots of workflows in MATLAB and Simulink to help you take your work out into the world.
So if you have questions, you have doubts, just come talk to us. We'd be happy to sit down with you, try to understand your needs, and help you get from A to B in the best way that we can.
So just to wrap things up, digital twin definitely is a major trend, and it's affecting all of our lives. It's all about a journey and an evolution as you go. And we here at The MathWorks are excited to hopefully be a part of that journey and help support you along your way. Thank you, everyone.
[APPLAUSE]