Calling Python from Matlab - remove header/string from text files
Hi all,
I need to open a text file (attached) to handle the data within it. Normally I would do this in Python using pandas, but now, for some reason, I want to write the code in Matlab. However, I want to outsource specific tasks from Matlab to Python for convenience (I am also open to pure Matlab solutions).
1. this function
fileData = py.pandas.read_csv(fullName, '\t')
gives me this header handling error
'Python Error: ParserError: Expected 1 fields in line 9, saw 14. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.'
Indeed, when I manually remove the first 8 lines, I can import the data as a DataFrame, but I can't really work with it the way I would with native Matlab data. So I have a second issue:
2. this function
b = py.pandas.core.frame.DataFrame(fileData.T, {' Elapsed time(sec)'})
gives me as ans
b =
Python DataFrame with properties:
T: [1×1 py.pandas.core.frame.DataFrame]
at: [1×1 py.pandas.core.indexing._AtIndexer]
attrs: [1×1 py.dict]
axes: [1×2 py.list]
columns: [1×1 py.pandas.core.indexes.range.RangeIndex]
dtypes: [1×1 py.pandas.core.series.Series]
empty: 0
iat: [1×1 py.pandas.core.indexing._iAtIndexer]
iloc: [1×1 py.pandas.core.indexing._iLocIndexer]
index: [1×1 py.pandas.core.indexes.base.Index]
loc: [1×1 py.pandas.core.indexing._LocIndexer]
ndim: [1×1 py.int]
shape: [1×2 py.tuple]
size: [1×1 py.numpy.int32]
style: [1×1 py.pandas.io.formats.style.Styler]
values: [1×1 py.numpy.ndarray]
0 1 2 3 4 5 6 7
Elapsed time(sec) 1.344 2.687 4.047 5.422 6.765 8.14 9.484 10.828
So it displays the column, but I cannot perform any operations on it or address individual elements of the 'array'.
Answers (1)
surya venu
on 17 Jul 2024
Hi,
It seems like you have a couple of issues when handling the data between Matlab and Python.
1: Reading the CSV File
The error you are encountering is due to the extra lines at the beginning of the file, which are not part of the actual data. pandas infers a single column from the first line and then fails at line 9, the first line with 14 fields. A good approach is to handle this preprocessing step in Python before passing the data to Matlab.
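As a minimal sketch of that preprocessing idea, pandas can skip the leading lines itself through the `skiprows` argument of `read_csv`, so no intermediate file is strictly needed. The sample file content, the `Value` column, and the helper name `read_after_header` are assumptions for illustration; only the count of 8 lines and the `Elapsed time(sec)` column come from the question.

```python
import os
import tempfile

import pandas as pd


def read_after_header(path, n_header_lines=8):
    """Read a tab-separated file, ignoring the first n_header_lines lines."""
    # An integer skiprows drops that many lines from the top of the file;
    # the next line is then used as the column header.
    return pd.read_csv(path, sep="\t", skiprows=n_header_lines)


# Build a small stand-in file: 8 free-text lines, then a tab-separated table.
# The contents are hypothetical; the real file will differ.
sample = "".join(f"comment line {i}\n" for i in range(1, 9))
sample += "Elapsed time(sec)\tValue\n1.344\t10\n2.687\t20\n4.047\t30\n"

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    sample_path = tmp.name

df = read_after_header(sample_path)
os.remove(sample_path)
print(df)
```

If the number of header lines varies between files, a fixed `skiprows` value will not be enough and the header line has to be located some other way.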
2: Manipulating the DataFrame
You are having trouble manipulating the DataFrame in Matlab after loading it from Python. The DataFrame remains a Python object inside Matlab, so Matlab indexing and arithmetic do not apply to it until its values are converted to a Matlab type.
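One way to sidestep this, sketched here under assumptions, is to have Python hand back a plain numeric container rather than a DataFrame. The helper name `column_as_doubles` is hypothetical, and the sample values are the first few shown in the question's output.

```python
from array import array

import pandas as pd


def column_as_doubles(df, column):
    """Return one DataFrame column as a flat array of C doubles."""
    # Strip stray whitespace from column names, such as the leading space
    # in ' Elapsed time(sec)' shown in the question.
    cleaned = df.rename(columns=lambda name: str(name).strip())
    # pd.to_numeric raises if the column holds non-numeric text,
    # which surfaces leftover header lines early.
    values = pd.to_numeric(cleaned[column.strip()])
    return array("d", values.astype(float).tolist())


# Hypothetical data using the first values displayed in the question.
df = pd.DataFrame({" Elapsed time(sec)": [1.344, 2.687, 4.047, 5.422]})
elapsed = column_as_doubles(df, "Elapsed time(sec)")
print(elapsed)
```

Matlab's documentation on Python array types describes converting a `py.array.array` of doubles with `double(...)`, so a return value like this should be easier to index on the Matlab side than a DataFrame. Whether that conversion is available depends on the Matlab release in use.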
Solution
We can write a Python script to preprocess the file, and then use Matlab to read and manipulate the cleaned data. Below are the steps:
Step 1: Preprocess the CSV File in Python
Create a Python script to clean the CSV file by removing the first 8 lines.
# preprocess_csv.py
import pandas as pd
import sys

def preprocess_csv(input_file, output_file):
    with open(input_file, 'r') as file:
        lines = file.readlines()
    # Remove the first 8 lines
    cleaned_lines = lines[8:]
    with open(output_file, 'w') as file:
        file.writelines(cleaned_lines)
    df = pd.read_csv(output_file, delimiter='\t')
    return df

if __name__ == "__main__":
    input_file = sys.argv[1]
    output_file = sys.argv[2]
    df = preprocess_csv(input_file, output_file)
    print(df)
Run this script from Matlab using the "system" command or manually from the command line.
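If the number of leading lines is not always 8, a variation on the same idea is to look for the header line instead of counting. This is only a sketch: the function name `strip_preamble`, the marker string, and the sample lines are assumptions, with the marker taken from the column name shown in the question.

```python
import io


def strip_preamble(lines, marker="Elapsed time(sec)"):
    """Return the lines from the first one containing marker onwards."""
    for index, line in enumerate(lines):
        if marker in line:
            return lines[index:]
    raise ValueError(f"no line containing {marker!r} found")


# Hypothetical stand-in for the instrument file.
raw = io.StringIO(
    "Device: demo\n"
    "Operator: n/a\n"
    "\n"
    " Elapsed time(sec)\tValue\n"
    "1.344\t10\n"
    "2.687\t20\n"
)
cleaned_lines = strip_preamble(raw.readlines())
print("".join(cleaned_lines), end="")
```

The returned lines can be written to the output file in the same way as `cleaned_lines` in the script.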
Step 2: Load and Manipulate Data in Matlab
After preprocessing the file, you can load the cleaned CSV file in Matlab and manipulate the data.
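input_file = 'path/to/your/input_file.txt';
output_file = 'path/to/your/output_file.txt';
% Quote the paths so that spaces in them do not break the command
system(sprintf('python preprocess_csv.py "%s" "%s"', input_file, output_file));
% Pass the separator by keyword; sprintf('\t') yields a real tab character,
% whereas '\t' in single quotes is the two characters backslash and t
fileData = py.pandas.read_csv(output_file, pyargs('sep', sprintf('\t')));
% This conversion assumes that every column is numeric
data = double(fileData.values);
elapsed_time = data(:, 1);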
Hope it helps.