
Detailed Information

Neural network methods for natural language processing (13 checkouts)

Material Type
Book (monograph)
Personal Author
Goldberg, Yoav.
Title / Statement of Responsibility
Neural network methods for natural language processing / Yoav Goldberg.
Publication
[San Rafael, California] : Morgan & Claypool Publishers, c2017.
Physical Description
xxii, 287 p. : ill. ; 24 cm.
Series
Synthesis lectures on human language technologies ; #37
ISBN
9781627052986
Bibliographic Note
Includes bibliographical references.
General Subject Headings
Natural language processing (Computer science); Neural networks (Computer science)
000 00000nam u2200205 a 4500
001 000045910987
005 20170725102149
008 170725s2017 caua b 000 0 eng d
020 ▼a 9781627052986
040 ▼a 211009 ▼c 211009 ▼d 211009
082 0 4 ▼a 006.35 ▼2 23
084 ▼a 006.35 ▼2 DDCK
090 ▼a 006.35 ▼b G618n
100 1 ▼a Goldberg, Yoav.
245 1 0 ▼a Neural network methods for natural language processing / ▼c Yoav Goldberg.
260 ▼a [San Rafael, California] : ▼b Morgan & Claypool Publishers, ▼c c2017.
300 ▼a xxii, 287 p. : ▼b ill. ; ▼c 24 cm.
490 1 ▼a Synthesis lectures on human language technologies ; ▼v #37
504 ▼a Includes bibliographical references.
650 0 ▼a Natural language processing (Computer science).
650 0 ▼a Neural networks (Computer science).
830 0 ▼a Synthesis lectures on human language technologies ; ▼v #37.
945 ▼a KLPA

No. | Location | Call Number | Registration No. | Status | Due Date
1 | Main Library / Stacks, 6F | 006.35 G618n | 111782534 (5 checkouts) | Available for loan | -
2 | Science Library / Sci-Info (2F stacks) | 006.35 G618n | 121241040 (8 checkouts) | Available for loan | -

Content Information

Book Description

Neural networks are a family of powerful machine learning models. This book focuses on the application of neural network models to natural language data. The first half of the book (Parts I and II) covers the basics of supervised machine learning and feed-forward neural networks, the basics of working with machine learning over language data, and the use of vector-based rather than symbolic representations for words. It also covers the computation-graph abstraction, which makes it easy to define and train arbitrary neural networks and which underlies the design of contemporary neural network software libraries.
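
To make the computation-graph idea concrete, here is a minimal, self-contained sketch (illustrative only, not code from the book; the Node class and its methods are hypothetical). Each operation records its inputs and a local backward rule, so the gradient of the final value with respect to every input is obtained by one reverse traversal of the graph:

# Minimal scalar computation-graph sketch (illustrative, not from the book).
class Node:
    def __init__(self, value, parents=()):
        self.value = value          # result of the forward computation
        self.parents = parents      # nodes this value was computed from
        self.backward_fn = None     # local rule that pushes gradient to parents
        self.grad = 0.0

    def __add__(self, other):
        out = Node(self.value + other.value, (self, other))
        def backward():
            self.grad += out.grad
            other.grad += out.grad
        out.backward_fn = backward
        return out

    def __mul__(self, other):
        out = Node(self.value * other.value, (self, other))
        def backward():
            self.grad += other.value * out.grad
            other.grad += self.value * out.grad
        out.backward_fn = backward
        return out

    def backward(self):
        # Topologically order the graph, then apply the chain rule in reverse.
        order, seen = [], set()
        def visit(node):
            if node not in seen:
                seen.add(node)
                for p in node.parents:
                    visit(p)
                order.append(node)
        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            if node.backward_fn is not None:
                node.backward_fn()

# y = w * x + b; backward() fills in dy/dw and dy/db.
w, x, b = Node(2.0), Node(3.0), Node(1.0)
y = w * x + b
y.backward()
print(y.value, w.grad, b.grad)  # 7.0 3.0 1.0

Production libraries build the same kind of graph over tensors rather than scalars, which is what allows arbitrary architectures to be defined and trained.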

The second part of the book (Parts III and IV) introduces more specialized neural network architectures, including 1D convolutional neural networks, recurrent neural networks, conditioned-generation models, and attention-based models. These architectures and techniques are the driving force behind state-of-the-art algorithms for machine translation, syntactic parsing, and many other applications. Finally, we also discuss tree-shaped networks, structured prediction, and the prospects of multi-task learning.
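
As a toy illustration of the attention mechanism mentioned above (a minimal sketch, not the book's code; the function attend and its arguments are made up for this example), the snippet below scores each encoder state against a decoder query and returns their softmax-weighted average as the context vector:

import numpy as np

def attend(query, encoder_states):
    # Dot-product attention: score each encoder state against the decoder query.
    scores = encoder_states @ query
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()            # softmax over source positions
    context = weights @ encoder_states  # weighted average of encoder states
    return context, weights

encoder_states = np.random.randn(5, 8)  # 5 source positions, hidden size 8
query = np.random.randn(8)              # current decoder state
context, weights = attend(query, encoder_states)
print(weights.round(3), context.shape)  # attention distribution over 5 positions, (8,)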



Information provided by: Aladin

Table of Contents

1. Introduction
1.1 The challenges of natural language processing
1.2 Neural networks and deep learning
1.3 Deep learning in NLP
1.3.1 Success stories
1.4 Coverage and organization
1.5 What's not covered
1.6 A note on terminology
1.7 Mathematical notation
Part I. Supervised classification and feed-forward neural networks
2. Learning basics and linear models
2.1 Supervised learning and parameterized functions
2.2 Train, test, and validation sets
2.3 Linear models
2.3.1 Binary classification
2.3.2 Log-linear binary classification
2.3.3 Multi-class classification
2.4 Representations
2.5 One-hot and dense vector representations
2.6 Log-linear multi-class classification
2.7 Training as optimization
2.7.1 Loss functions
2.7.2 Regularization
2.8 Gradient-based optimization
2.8.1 Stochastic gradient descent
2.8.2 Worked-out example
2.8.3 Beyond SGD
3. From linear models to multi-layer perceptrons
3.1 Limitations of linear models: The XOR problem
3.2 Nonlinear input transformations
3.3 Kernel methods
3.4 Trainable mapping functions
4. Feed-forward neural networks
4.1 A brain-inspired metaphor
4.2 In mathematical notation
4.3 Representation power
4.4 Common nonlinearities
4.5 Loss functions
4.6 Regularization and dropout
4.7 Similarity and distance layers
4.8 Embedding layers
5. Neural network training
5.1 The computation graph abstraction
5.1.1 Forward computation
5.1.2 Backward computation (derivatives, backprop)
5.1.3 Software
5.1.4 Implementation recipe
5.1.5 Network composition
5.2 Practicalities
5.2.1 Choice of optimization algorithm
5.2.2 Initialization
5.2.3 Restarts and ensembles
5.2.4 Vanishing and exploding gradients
5.2.5 Saturation and dead neurons
5.2.6 Shuffling
5.2.7 Learning rate
5.2.8 Minibatches
Part II. Working with natural language data
6. Features for textual data
6.1 Typology of NLP classification problems
6.2 Features for NLP problems
6.2.1 Directly observable properties
6.2.2 Inferred linguistic properties
6.2.3 Core features vs. combination features
6.2.4 Ngram features
6.2.5 Distributional features
7. Case studies of NLP features
7.1 Document classification: language identification
7.2 Document classification: topic classification
7.3 Document classification: authorship attribution
7.4 Word-in-context: part of speech tagging
7.5 Word-in-context: named entity recognition
7.6 Word in context, linguistic features: preposition sense disambiguation
7.7 Relation between words in context: arc-factored parsing
8. From textual features to inputs
8.1 Encoding categorical features
8.1.1 One-hot encodings
8.1.2 Dense encodings (feature embeddings)
8.1.3 Dense vectors vs. one-hot representations
8.2 Combining dense vectors
8.2.1 Window-based features
8.2.2 Variable number of features: continuous bag of words
8.3 Relation between one-hot and dense vectors
8.4 Odds and ends
8.4.1 Distance and position features
8.4.2 Padding, unknown words, and word dropout
8.4.3 Feature combinations
8.4.4 Vector sharing
8.4.5 Dimensionality
8.4.6 Embeddings vocabulary
8.4.7 Network's output
8.5 Example: part-of-speech tagging
8.6 Example: arc-factored parsing
9. Language modeling
9.1 The language modeling task
9.2 Evaluating language models: perplexity
9.3 Traditional approaches to language modeling
9.3.1 Further reading
9.3.2 Limitations of traditional language models
9.4 Neural language models
9.5 Using language models for generation
9.6 Byproduct: word representations
10. Pre-trained word representations
10.1 Random initialization
10.2 Supervised task-specific pre-training
10.3 Unsupervised pre-training
10.3.1 Using pre-trained embeddings
10.4 Word embedding algorithms
10.4.1 Distributional hypothesis and word representations
10.4.2 From neural language models to distributed representations
10.4.3 Connecting the worlds
10.4.4 Other algorithms
10.5 The choice of contexts
10.5.1 Window approach
10.5.2 Sentences, paragraphs, or documents
10.5.3 Syntactic window
10.5.4 Multilingual
10.5.5 Character-based and sub-word representations
10.6 Dealing with multi-word units and word inflections
10.7 Limitations of distributional methods
11. Using word embeddings
11.1 Obtaining word vectors
11.2 Word similarity
11.3 Word clustering
11.4 Finding similar words
11.4.1 Similarity to a group of words
11.5 Odd-one out
11.6 Short document similarity
11.7 Word analogies
11.8 Retrofitting and projections
11.9 Practicalities and pitfalls
12. Case study: a feed-forward architecture for sentence meaning inference
12.1 Natural language inference and the SNLI dataset
12.2 A textual similarity network
Part III. Specialized architectures
13. Ngram detectors: convolutional neural networks
13.1 Basic convolution + pooling
13.1.1 1D convolutions over text
13.1.2 Vector pooling
13.1.3 Variations
13.2 Alternative: feature hashing
13.3 Hierarchical convolutions
14. Recurrent neural networks: modeling sequences and stacks
14.1 The RNN abstraction
14.2 RNN training
14.3 Common RNN usage-patterns
14.3.1 Acceptor
14.3.2 Encoder
14.3.3 Transducer
14.4 Bidirectional RNNs (biRNN)
14.5 Multi-layer (stacked) RNNs
14.6 RNNs for representing stacks
14.7 A note on reading the literature
15. Concrete recurrent neural network architectures
15.1 CBOW as an RNN
15.2 Simple RNN
15.3 Gated architectures
15.3.1 LSTM
15.3.2 GRU
15.4 Other variants
15.5 Dropout in RNNs
16. Modeling with recurrent networks
16.1 Acceptors
16.1.1 Sentiment classification
16.1.2 Subject-verb agreement grammaticality detection
16.2 RNNs as feature extractors
16.2.1 Part-of-speech tagging
16.2.2 RNN-CNN document classification
16.2.3 Arc-factored dependency parsing
17. Conditioned generation
17.1 RNN generators
17.1.1 Training generators
17.2 Conditioned generation (encoder-decoder)
17.2.1 Sequence to sequence models
17.2.2 Applications
17.2.3 Other conditioning contexts
17.3 Unsupervised sentence similarity
17.4 Conditioned generation with attention
17.4.1 Computational complexity
17.4.2 Interpretability
17.5 Attention-based models in NLP
17.5.1 Machine translation
17.5.2 Morphological inflection
17.5.3 Syntactic parsing
Part IV. Additional topics
18. Modeling trees with recursive neural networks
18.1 Formal definition
18.2 Extensions and variations
18.3 Training recursive neural networks
18.4 A simple alternative-linearized trees
18.5 Outlook
19. Structured output prediction
19.1 Search-based structured prediction
19.1.1 Structured prediction with linear models
19.1.2 Nonlinear structured prediction
19.1.3 Probabilistic objective (CRF)
19.1.4 Approximate search
19.1.5 Reranking
19.1.6 See also
19.2 Greedy structured prediction
19.3 Conditional generation as structured output prediction
19.4 Examples
19.4.1 Search-based structured prediction: first-order dependency parsing
19.4.2 Neural-CRF for named entity recognition
19.4.3 Approximate NER-CRF with beam-search
20. Cascaded, multi-task and semi-supervised learning
20.1 Model cascading
20.2 Multi-task learning
20.2.1 Training in a multi-task setup
20.2.2 Selective sharing
20.2.3 Word-embeddings pre-training as multi-task learning
20.2.4 Multi-task learning in conditioned generation
20.2.5 Multi-task learning as regularization
20.2.6 Caveats
20.3 Semi-supervised learning
20.4 Examples
20.4.1 Gaze-prediction and sentence compression
20.4.2 Arc labeling and syntactic parsing
20.4.3 Preposition sense disambiguation and preposition translation prediction
20.4.4 Conditioned generation: multilingual machine translation, parsing, and image captioning
20.5 Outlook
21. Conclusion
21.1 What have we seen?
21.2 The challenges ahead
Bibliography
Author's biography

New Arrivals in Related Fields

Dyer-Witheford, Nick (2026)
양성봉 (2025)