Word-by-word text generation example gives error "Invalid training data, labels must not contain undefined values"

When I run the example, trainNetwork throws this error:
Invalid training data, labels must not contain undefined values
Here is the code I am running:
url = "https://www.gutenberg.org/files/11/11-h/11-h.htm";
code = webread(url);
tree = htmlTree(code);
selector = "p";
subtrees = findElement(tree,selector);
textData = extractHTMLText(subtrees);
textData(textData == "") = [];
documents = tokenizedDocument(textData);
ds = documentGenerationDatastore(documents);
ds = sort(ds);
inputSize = 1;
embeddingDimension = 100;
numWords = numel(ds.Encoding.Vocabulary);
numClasses = numWords + 1;
layers = [
    sequenceInputLayer(inputSize)
    wordEmbeddingLayer(embeddingDimension,numWords)
    lstmLayer(100)
    dropoutLayer(0.2)
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];
options = trainingOptions('adam', ...
    'MaxEpochs',300, ...
    'InitialLearnRate',0.01, ...
    'MiniBatchSize',32, ...
    'Shuffle','never', ...
    'Plots','training-progress', ...
    'Verbose',false);
net = trainNetwork(ds,layers,options);
enc = ds.Encoding;
wordIndex = word2ind(enc,"startOfText")
vocabulary = string(net.Layers(end).Classes);
generatedText = "";
maxLength = 500;
while strlength(generatedText) < maxLength
    % Predict the next word scores.
    [net,wordScores] = predictAndUpdateState(net,wordIndex,'ExecutionEnvironment','cpu');
    % Sample the next word.
    newWord = datasample(vocabulary,1,'Weights',wordScores);
    % Stop predicting at the end of text.
    if newWord == "EndOfText"
        break
    end
    % Add the word to the generated text.
    generatedText = generatedText + " " + newWord;
    % Find the word index for the next input.
    wordIndex = word2ind(enc,newWord);
end
punctuationCharacters = ["." "," "’" ")" ":" "?" "!"];
generatedText = replace(generatedText," " + punctuationCharacters,punctuationCharacters);
punctuationCharacters = ["(" "‘"];
generatedText = replace(generatedText,punctuationCharacters + " ",punctuationCharacters)
net = resetState(net);
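
For anyone trying to narrow this down: the error message refers to <undefined> categorical values, which MATLAB produces when a label is not in the declared category set. The sketch below first shows that behavior on made-up data (the words and classes are purely illustrative), then checks one mini-batch from the datastore above. The second part is only a guess at the datastore's output: it assumes read(ds) returns a table whose last variable is a cell array of categorical response sequences, which may not match the datastore shipped with the example.

```matlab
% Part 1: how <undefined> labels arise (illustrative data only).
words   = ["alice" "rabbit" "queen"];
classes = ["alice" "rabbit"];            % "queen" is deliberately left out
labels  = categorical(words, classes);   % "queen" becomes <undefined>
disp(isundefined(labels))                % third entry is true

% Part 2: inspect one mini-batch from the datastore, if it exists.
% ASSUMPTION: read(ds) returns a table whose last variable holds a cell
% array of categorical responses. Adjust if your datastore differs.
if exist('ds','var')
    reset(ds);
    batch = read(ds);
    if istable(batch)
        responses = batch{:, end};
        if iscell(responses)
            hasUndefined = cellfun(@(y) any(isundefined(y), 'all'), responses);
        else
            hasUndefined = isundefined(responses);
        end
        % Rows of the batch whose responses contain <undefined> labels.
        disp(find(hasUndefined))
    end
    reset(ds);   % rewind so training starts from the first observation
end
```

If the second part lists any rows, the responses in those observations contain words that are not among the classes the datastore declares, which would be a place to start looking.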

Answers (0)

Release

R2021b
