| Tag | Ind. | Content |
|---|---|---|
| 000 | | 00000nam u2200205 a 4500 |
| 001 | | 000045910987 |
| 005 | | 20170725102149 |
| 008 | | 170725s2017 caua b 000 0 eng d |
| 020 | | ▼a 9781627052986 |
| 040 | | ▼a 211009 ▼c 211009 ▼d 211009 |
| 082 | 0 4 | ▼a 006.35 ▼2 23 |
| 084 | | ▼a 006.35 ▼2 DDCK |
| 090 | | ▼a 006.35 ▼b G618n |
| 100 | 1 | ▼a Goldberg, Yoav. |
| 245 | 1 0 | ▼a Neural network methods for natural language processing / ▼c Yoav Goldberg. |
| 260 | | ▼a [San Rafael, California] : ▼b Morgan & Claypool Publishers, ▼c c2017. |
| 300 | | ▼a xxii, 287 p. : ▼b ill. ; ▼c 24 cm. |
| 490 | 1 | ▼a Synthesis lectures on human language technologies ; ▼v #37 |
| 504 | | ▼a Includes bibliographical references. |
| 650 | 0 | ▼a Natural language processing (Computer science). |
| 650 | 0 | ▼a Neural networks (Computer science). |
| 830 | 0 | ▼a Synthesis lectures on human language technologies ; ▼v #37. |
| 945 | | ▼a KLPA |
Holdings Information

| No. | Location | Call Number | Registration No. | Status | Due Date | Reservation | Service |
|---|---|---|---|---|---|---|---|
| 1 | Main Library / Stacks, 6th floor | 006.35 G618n | 111782534 (5 loans) | Available for loan | | | |
| 2 | Science Library / Sci-Info (2nd-floor stacks) | 006.35 G618n | 121241040 (8 loans) | Available for loan | | | |
Contents Information

Book Description
Neural networks are a family of powerful machine learning models. This book focuses on the application of neural network models to natural language data. The first half of the book (Parts I and II) covers the basics of supervised machine learning and feed-forward neural networks, the basics of working with machine learning over language data, and the use of vector-based rather than symbolic representations for words. It also covers the computation-graph abstraction, which makes it easy to define and train arbitrary neural networks and is the basis behind the design of contemporary neural network software libraries.
The second part of the book (Parts III and IV) introduces more specialized neural network architectures, including 1D convolutional neural networks, recurrent neural networks, conditioned-generation models, and attention-based models. These architectures and techniques are the driving force behind state-of-the-art algorithms for machine translation, syntactic parsing, and many other applications. Finally, we also discuss tree-shaped networks, structured prediction, and the prospects of multi-task learning.
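The computation-graph abstraction mentioned in the description is easiest to see in code. The sketch below is not from the book; it uses PyTorch as one example of a library built around that abstraction, and the toy data, one-layer network, and learning rate are invented purely for illustration.

```python
import torch

# Toy data: 4 examples with 3 features each, and binary labels (made up).
x = torch.randn(4, 3)
y = torch.tensor([0., 1., 1., 0.])

# Parameters of a single linear layer; requires_grad=True asks the library
# to record every operation on them in the computation graph.
w = torch.randn(3, requires_grad=True)
b = torch.zeros(1, requires_grad=True)

for step in range(100):
    # Forward pass: each operation extends the graph.
    logits = x @ w + b
    loss = torch.nn.functional.binary_cross_entropy_with_logits(logits, y)

    # Backward pass: gradients are computed automatically by walking the graph.
    loss.backward()

    # Plain stochastic-gradient-descent update, then reset the gradients.
    with torch.no_grad():
        w -= 0.1 * w.grad
        b -= 0.1 * b.grad
        w.grad.zero_()
        b.grad.zero_()
```

Defining a deeper network or a different loss only changes the forward-pass lines; the backward pass is still derived automatically from the recorded graph, which is what the description means by making it easy to define and train arbitrary networks.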
Table of Contents
1. Introduction: 1.1 The challenges of natural language processing; 1.2 Neural networks and deep learning; 1.3 Deep learning in NLP; 1.3.1 Success stories; 1.4 Coverage and organization; 1.5 What's not covered; 1.6 A note on terminology; 1.7 Mathematical notation

Part I. Supervised classification and feed-forward neural networks

2. Learning basics and linear models: 2.1 Supervised learning and parameterized functions; 2.2 Train, test, and validation sets; 2.3 Linear models; 2.3.1 Binary classification; 2.3.2 Log-linear binary classification; 2.3.3 Multi-class classification; 2.4 Representations; 2.5 One-hot and dense vector representations; 2.6 Log-linear multi-class classification; 2.7 Training as optimization; 2.7.1 Loss functions; 2.7.2 Regularization; 2.8 Gradient-based optimization; 2.8.1 Stochastic gradient descent; 2.8.2 Worked-out example; 2.8.3 Beyond SGD

3. From linear models to multi-layer perceptrons: 3.1 Limitations of linear models: the XOR problem; 3.2 Nonlinear input transformations; 3.3 Kernel methods; 3.4 Trainable mapping functions

4. Feed-forward neural networks: 4.1 A brain-inspired metaphor; 4.2 In mathematical notation; 4.3 Representation power; 4.4 Common nonlinearities; 4.5 Loss functions; 4.6 Regularization and dropout; 4.7 Similarity and distance layers; 4.8 Embedding layers

5. Neural network training: 5.1 The computation graph abstraction; 5.1.1 Forward computation; 5.1.2 Backward computation (derivatives, backprop); 5.1.3 Software; 5.1.4 Implementation recipe; 5.1.5 Network composition; 5.2 Practicalities; 5.2.1 Choice of optimization algorithm; 5.2.2 Initialization; 5.2.3 Restarts and ensembles; 5.2.4 Vanishing and exploding gradients; 5.2.5 Saturation and dead neurons; 5.2.6 Shuffling; 5.2.7 Learning rate; 5.2.8 Minibatches

Part II. Working with natural language data

6. Features for textual data: 6.1 Typology of NLP classification problems; 6.2 Features for NLP problems; 6.2.1 Directly observable properties; 6.2.2 Inferred linguistic properties; 6.2.3 Core features vs. combination features; 6.2.4 Ngram features; 6.2.5 Distributional features

7. Case studies of NLP features: 7.1 Document classification: language identification; 7.2 Document classification: topic classification; 7.3 Document classification: authorship attribution; 7.4 Word-in-context: part of speech tagging; 7.5 Word-in-context: named entity recognition; 7.6 Word in context, linguistic features: preposition sense disambiguation; 7.7 Relation between words in context: arc-factored parsing

8. From textual features to inputs: 8.1 Encoding categorical features; 8.1.1 One-hot encodings; 8.1.2 Dense encodings (feature embeddings); 8.1.3 Dense vectors vs. one-hot representations; 8.2 Combining dense vectors; 8.2.1 Window-based features; 8.2.2 Variable number of features: continuous bag of words; 8.3 Relation between one-hot and dense vectors; 8.4 Odds and ends; 8.4.1 Distance and position features; 8.4.2 Padding, unknown words, and word dropout; 8.4.3 Feature combinations; 8.4.4 Vector sharing; 8.4.5 Dimensionality; 8.4.6 Embeddings vocabulary; 8.4.7 Network's output; 8.5 Example: part-of-speech tagging; 8.6 Example: arc-factored parsing

9. Language modeling: 9.1 The language modeling task; 9.2 Evaluating language models: perplexity; 9.3 Traditional approaches to language modeling; 9.3.1 Further reading; 9.3.2 Limitations of traditional language models; 9.4 Neural language models; 9.5 Using language models for generation; 9.6 Byproduct: word representations

10. Pre-trained word representations: 10.1 Random initialization; 10.2 Supervised task-specific pre-training; 10.3 Unsupervised pre-training; 10.3.1 Using pre-trained embeddings; 10.4 Word embedding algorithms; 10.4.1 Distributional hypothesis and word representations; 10.4.2 From neural language models to distributed representations; 10.4.3 Connecting the worlds; 10.4.4 Other algorithms; 10.5 The choice of contexts; 10.5.1 Window approach; 10.5.2 Sentences, paragraphs, or documents; 10.5.3 Syntactic window; 10.5.4 Multilingual; 10.5.5 Character-based and sub-word representations; 10.6 Dealing with multi-word units and word inflections; 10.7 Limitations of distributional methods

11. Using word embeddings: 11.1 Obtaining word vectors; 11.2 Word similarity; 11.3 Word clustering; 11.4 Finding similar words; 11.4.1 Similarity to a group of words; 11.5 Odd-one out; 11.6 Short document similarity; 11.7 Word analogies; 11.8 Retrofitting and projections; 11.9 Practicalities and pitfalls

12. Case study: a feed-forward architecture for sentence meaning inference: 12.1 Natural language inference and the SNLI dataset; 12.2 A textual similarity network

Part III. Specialized architectures

13. Ngram detectors: convolutional neural networks: 13.1 Basic convolution + pooling; 13.1.1 1D convolutions over text; 13.1.2 Vector pooling; 13.1.3 Variations; 13.2 Alternative: feature hashing; 13.3 Hierarchical convolutions

14. Recurrent neural networks: modeling sequences and stacks: 14.1 The RNN abstraction; 14.2 RNN training; 14.3 Common RNN usage-patterns; 14.3.1 Acceptor; 14.3.2 Encoder; 14.3.3 Transducer; 14.4 Bidirectional RNNs (biRNN); 14.5 Multi-layer (stacked) RNNs; 14.6 RNNs for representing stacks; 14.7 A note on reading the literature

15. Concrete recurrent neural network architectures: 15.1 CBOW as an RNN; 15.2 Simple RNN; 15.3 Gated architectures; 15.3.1 LSTM; 15.3.2 GRU; 15.4 Other variants; 15.5 Dropout in RNNs

16. Modeling with recurrent networks: 16.1 Acceptors; 16.1.1 Sentiment classification; 16.1.2 Subject-verb agreement grammaticality detection; 16.2 RNNs as feature extractors; 16.2.1 Part-of-speech tagging; 16.2.2 RNN-CNN document classification; 16.2.3 Arc-factored dependency parsing

17. Conditioned generation: 17.1 RNN generators; 17.1.1 Training generators; 17.2 Conditioned generation (encoder-decoder); 17.2.1 Sequence to sequence models; 17.2.2 Applications; 17.2.3 Other conditioning contexts; 17.3 Unsupervised sentence similarity; 17.4 Conditioned generation with attention; 17.4.1 Computational complexity; 17.4.2 Interpretability; 17.5 Attention-based models in NLP; 17.5.1 Machine translation; 17.5.2 Morphological inflection; 17.5.3 Syntactic parsing

Part IV. Additional topics

18. Modeling trees with recursive neural networks: 18.1 Formal definition; 18.2 Extensions and variations; 18.3 Training recursive neural networks; 18.4 A simple alternative: linearized trees; 18.5 Outlook

19. Structured output prediction: 19.1 Search-based structured prediction; 19.1.1 Structured prediction with linear models; 19.1.2 Nonlinear structured prediction; 19.1.3 Probabilistic objective (CRF); 19.1.4 Approximate search; 19.1.5 Reranking; 19.1.6 See also; 19.2 Greedy structured prediction; 19.3 Conditional generation as structured output prediction; 19.4 Examples; 19.4.1 Search-based structured prediction: first-order dependency parsing; 19.4.2 Neural-CRF for named entity recognition; 19.4.3 Approximate NER-CRF with beam-search

20. Cascaded, multi-task and semi-supervised learning: 20.1 Model cascading; 20.2 Multi-task learning; 20.2.1 Training in a multi-task setup; 20.2.2 Selective sharing; 20.2.3 Word-embeddings pre-training as multi-task learning; 20.2.4 Multi-task learning in conditioned generation; 20.2.5 Multi-task learning as regularization; 20.2.6 Caveats; 20.3 Semi-supervised learning; 20.4 Examples; 20.4.1 Gaze-prediction and sentence compression; 20.4.2 Arc labeling and syntactic parsing; 20.4.3 Preposition sense disambiguation and preposition translation prediction; 20.4.4 Conditioned generation: multilingual machine translation, parsing, and image captioning; 20.5 Outlook

21. Conclusion: 21.1 What have we seen? 21.2 The challenges ahead

Bibliography

Author's biography
