One of the first things to deal with when analyzing linguistic data with a neural network is how to convert linguistic items like words and sentences into lists of numbers (vectors) so that networks can process them ("word embedding"). For an overview of this topic you could try googling "word2vec".
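To make the idea concrete, here's a minimal sketch of the simplest kind of word embedding, one-hot encoding, where each word gets a vector that is all zeros except for a 1 at that word's vocabulary index. (This is just an illustration of the general word-to-vector idea; methods like word2vec instead learn dense vectors that capture word similarity.)

```python
def one_hot_embed(sentence):
    """Map each word in a sentence to a one-hot vector over its vocabulary."""
    words = sentence.lower().split()
    vocab = sorted(set(words))                  # fixed word order -> fixed indices
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for w in words:
        vec = [0] * len(vocab)
        vec[index[w]] = 1                       # 1 at this word's vocabulary slot
        vectors.append(vec)
    return vocab, vectors

vocab, vecs = one_hot_embed("the cat saw the dog")
print(vocab)    # ['cat', 'dog', 'saw', 'the']
print(vecs[0])  # vector for "the": [0, 0, 0, 1]
```

One-hot vectors treat every pair of words as equally different, which is exactly the limitation that learned embeddings like word2vec address.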
In the earlier connectionist literature there are Elman's classic papers (e.g. Finding Structure in Time), Seidenberg's triangle model, and some of the chapters in the PDP volumes, including Smolensky's, which evolved into optimality theory, which is now dominant in phonology according to my colleagues. Here is a review I just found.
Simbrain 3 has somewhat primitive word embedding capabilities (no word2vec yet). I'm actually planning to finally make a new youtube video, and I will cover this. Hopefully in the next month.
Also, more broadly on the topic of tutorials, I am teaching a course using Simbrain and have co-written an open-content book on the topic. I'm still polishing it up, but I hope to release it as a free, modifiable text under a Creative Commons license, maybe around December.
For papers you are kind of stuck with what's publicly available. But a lot is publicly available, so that should keep you going for a while. You can also often email the authors and ask for a copy.
For introductions I like my own, which I mentioned above :) But that's still a ways from being done, so in the meantime you can search for Fausett, Haykin, and Rojas (search for those names + "neural network"). There are others as well. There is also the Goodfellow et al. deep learning book, which is quite current.