How to use the dataset in Visual Question Answering
I am working on a visual question answering problem: the model accepts an image and a question about it, then generates an answer to the question. I built a network with two parts. The first is a CNN model that handles the image input. The second is an ensemble LSTM+BiLSTM model that handles the text. My dataset has columns for the image path, the question, and the answer, and I have completed all preprocessing steps. My problem now is how to tell the model to take the image and the text, process them separately, and then fuse the two.
![](https://www.mathworks.com/matlabcentral/answers/uploaded_files/720824/image.png)
Above is the network I built. The layer named "in" has to accept the text (the question), and "im_in" has to accept the image. I don't know how to handle the dataset.
Can you suggest a specific method for building a model for the visual question answering problem in MATLAB?
Regards,
Answers (1)
Prince Kumar
on 19 Nov 2021
Hi Suheer Al-Hadhrami,
You can make use of "Multiple-Input Networks".
Please refer to the documentation: https://www.mathworks.com/help/deeplearning/ug/multiple-input-and-multiple-output-networks.html
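As a rough illustration of the dataset side of the multiple-input approach, here is a minimal sketch that builds one combined datastore returning an image, an encoded question, and an answer per read. It is not taken from the linked page: the dummy images, the two sample questions, the sequence length of 10, and all variable names are assumptions made so the example runs on its own, and the text encoding step assumes Text Analytics Toolbox is available.

```matlab
% Sketch only: dummy data stands in for the real table of
% image paths, questions, and answers described in the question.
folder = fullfile(tempdir, "vqa_demo");
if ~isfolder(folder)
    mkdir(folder);
end
imagePath = fullfile(folder, ["img1.png"; "img2.png"]);
for k = 1:numel(imagePath)
    % Random pixels, written to disk so imageDatastore has real files.
    imwrite(uint8(randi(255, [224 224 3])), imagePath(k));
end
question = ["what color is the car"; "how many dogs are there"];
answer = categorical(["red"; "two"]);
data = table(imagePath, question, answer);

% Encode each question as a fixed-length sequence of word indices
% (Text Analytics Toolbox). The length of 10 is an arbitrary choice.
documents = tokenizedDocument(data.question);
enc = wordEncoding(documents);
sequences = doc2sequence(enc, documents, 'Length', 10);

% One datastore per column of the table.
imds = imageDatastore(data.imagePath);   % check imds.Files keeps the table order
qds = arrayDatastore(sequences, 'OutputType', 'same');
ads = arrayDatastore(data.answer);

% Each read returns a 1-by-3 cell: {image, question sequence, answer}.
cds = combine(imds, qds, ads);
sample = read(cds);
disp(size(sample{1}));
disp(size(sample{2}));
disp(sample{3});
```

For training, the predictor columns generally need to follow the order of the network's input layers (see the InputNames property of the layer graph), with the response last, so the image and question datastores may have to be swapped in the call to combine. Whether a given release accepts an image input and a sequence input in the same network with a particular training function is worth confirming in the documentation for that release.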
The following link might be useful too