
Detailed information

Introduction to data science [electronic resource] : a Python approach to concepts, techniques and applications


Material type
E-Book (held)
Personal author
Igual, Laura. Seguí, Santi.
Title / Statement of responsibility
Introduction to data science [electronic resource] : a Python approach to concepts, techniques and applications / Laura Igual, Santi Seguí.
Publication
Cham : Springer, c2017.
Physical description
1 online resource (xiv, 218 p.) : ill.
Series
Undergraduate Topics in Computer Science, 1863-7310
ISBN
9783319500164 9783319500171 (e-book)
Summary
This accessible and classroom-tested textbook/reference presents an introduction to the fundamentals of the emerging and interdisciplinary field of data science. The coverage spans key concepts adopted from statistics and machine learning, useful techniques for graph analysis and parallel programming, and the practical application of data science for such tasks as building recommender systems or performing sentiment analysis. Topics and features: provides numerous practical case studies using real-world data throughout the book; supports understanding through hands-on experience of solving data science problems using Python; describes techniques and tools for statistical analysis, machine learning, graph analysis, and parallel programming; reviews a range of applications of data science, including recommender systems and sentiment analysis of text data; provides supplementary code resources and data at an associated website. This practically-focused textbook provides an ideal introduction to the field for upper-tier undergraduate and beginning graduate students from computer science, mathematics, statistics, and other technical disciplines. The work is also eminently suitable for professionals on continuous education short courses, and for researchers following self-study courses. Dr. Laura Igual is an Associate Professor at the Departament de Matemàtiques i Informàtica, Universitat de Barcelona, Spain. Dr. Santi Seguí is an Assistant Professor at the same institution.
General note
Title from e-Book title page.
Contents note
Introduction to Data Science -- Toolboxes for Data Scientists -- Descriptive statistics -- Statistical Inference -- Supervised Learning -- Regression Analysis -- Unsupervised Learning -- Network Analysis -- Recommender Systems -- Statistical Natural Language Processing for Sentiment Analysis -- Parallel Computing.
Bibliography note
Includes bibliographical references and index.
Additional physical form available
Issued also as a book.
Subject headings
Quantitative research. Python (Computer program language).
Link
URL
000 00000cam u2200205 a 4500
001 000045992210
005 20190805142708
006 m d
007 cr
008 190726s2017 sz a ob 001 0 eng d
020 ▼a 9783319500164
020 ▼a 9783319500171 (e-book)
040 ▼a 211009 ▼c 211009 ▼d 211009
050 4 ▼a QA76.9.D343
082 0 4 ▼a 001.42 ▼2 23
084 ▼a 001.42 ▼2 DDCK
090 ▼a 001.42
100 1 ▼a Igual, Laura.
245 1 0 ▼a Introduction to data science ▼h [electronic resource] : ▼b a Python approach to concepts, techniques and applications / ▼c Laura Igual, Santi Seguí.
260 ▼a Cham : ▼b Springer, ▼c c2017.
300 ▼a 1 online resource (xiv, 218 p.) : ▼b ill.
490 1 ▼a Undergraduate Topics in Computer Science, ▼x 1863-7310
500 ▼a Title from e-Book title page.
504 ▼a Includes bibliographical references and index.
505 0 ▼a Introduction to Data Science -- Toolboxes for Data Scientists -- Descriptive statistics -- Statistical Inference -- Supervised Learning -- Regression Analysis -- Unsupervised Learning -- Network Analysis -- Recommender Systems -- Statistical Natural Language Processing for Sentiment Analysis -- Parallel Computing.
520 ▼a This accessible and classroom-tested textbook/reference presents an introduction to the fundamentals of the emerging and interdisciplinary field of data science. The coverage spans key concepts adopted from statistics and machine learning, useful techniques for graph analysis and parallel programming, and the practical application of data science for such tasks as building recommender systems or performing sentiment analysis. Topics and features: Provides numerous practical case studies using real-world data throughout the book Supports understanding through hands-on experience of solving data science problems using Python Describes techniques and tools for statistical analysis, machine learning, graph analysis, and parallel programming Reviews a range of applications of data science, including recommender systems and sentiment analysis of text data Provides supplementary code resources and data at an associated website This practically-focused textbook provides an ideal introduction to the field for upper-tier undergraduate and beginning graduate students from computer science, mathematics, statistics, and other technical disciplines. The work is also eminently suitable for professionals on continuous education short courses, and to researchers following self-study courses. Dr. Laura Igual is an Associate Professor at the Departament de Matemàtiques i Informàtica, Universitat de Barcelona, Spain. Dr. Santi Seguí is an Assistant Professor at the same institution.
530 ▼a Issued also as a book.
538 ▼a Mode of access: World Wide Web.
650 0 ▼a Quantitative research.
650 0 ▼a Python (Computer program language).
700 1 ▼a Seguí, Santi.
830 0 ▼a Undergraduate Topics in Computer Science.
856 4 0 ▼u https://oca.korea.ac.kr/link.n2s?url=https://doi.org/10.1007/978-3-319-50017-1
945 ▼a KLPA
991 ▼a E-Book (held)

Holdings

No. | Location | Call number | Accession number | Status | Due date | Reservation | Service
1 | Main Library/e-Book Collection/ | CR 001.42 | E14016062 | Not for loan (viewing available) | | | M

Content information

Book description

This accessible and classroom-tested textbook/reference presents an introduction to the fundamentals of the emerging and interdisciplinary field of data science. The coverage spans key concepts adopted from statistics and machine learning, useful techniques for graph analysis and parallel programming, and the practical application of data science for such tasks as building recommender systems or performing sentiment analysis. Topics and features: provides numerous practical case studies using real-world data throughout the book; supports understanding through hands-on experience of solving data science problems using Python; describes techniques and tools for statistical analysis, machine learning, graph analysis, and parallel programming; reviews a range of applications of data science, including recommender systems and sentiment analysis of text data; provides supplementary code resources and data at an associated website.







Information provided by: Aladin

Table of contents

CONTENTS
1 Introduction to Data Science = 1
 1.1 What is Data Science? = 1
 1.2 About This Book = 3
2 Toolboxes for Data Scientists = 5
 2.1 Introduction = 5
 2.2 Why Python? = 6
 2.3 Fundamental Python Libraries for Data Scientists = 6
  2.3.1 Numeric and Scientific Computation : NumPy and SciPy = 7
  2.3.2 Scikit-Learn : Machine Learning in Python = 7
  2.3.3 Pandas : Python Data Analysis Library = 7
 2.4 Data Science Ecosystem Installation = 7
 2.5 Integrated Development Environments (IDE) = 8
  2.5.1 Web Integrated Development Environment (WIDE) : Jupyter = 9
 2.6 Get Started with Python for Data Scientists = 10
  2.6.1 Reading = 14
  2.6.2 Selecting Data = 16
  2.6.3 Filtering Data = 17
  2.6.4 Filtering Missing Values = 17
  2.6.5 Manipulating Data = 18
  2.6.6 Sorting = 22
  2.6.7 Grouping Data = 23
  2.6.8 Rearranging Data = 24
  2.6.9 Ranking Data = 25
  2.6.10 Plotting = 26
 2.7 Conclusions = 28
3 Descriptive Statistics = 29
 3.1 Introduction = 29
 3.2 Data Preparation = 30
  3.2.1 The Adult Example = 30
 3.3 Exploratory Data Analysis = 32
  3.3.1 Summarizing the Data = 32
  3.3.2 Data Distributions = 36
  3.3.3 Outlier Treatment = 38
  3.3.4 Measuring Asymmetry : Skewness and Pearson's Median Skewness Coefficient = 41
  3.3.5 Continuous Distribution = 42
  3.3.6 Kernel Density = 44
 3.4 Estimation = 46
  3.4.1 Sample and Estimated Mean, Variance and Standard Scores = 46
  3.4.2 Covariance, and Pearson's and Spearman's Rank Correlation = 47
 3.5 Conclusions = 50
  References = 50
4 Statistical Inference = 51
 4.1 Introduction = 51
 4.2 Statistical Inference : The Frequentist Approach = 52
 4.3 Measuring the Variability in Estimates = 52
  4.3.1 Point Estimates = 53
  4.3.2 Confidence Intervals = 56
 4.4 Hypothesis Testing = 59
  4.4.1 Testing Hypotheses Using Confidence Intervals = 60
  4.4.2 Testing Hypotheses Using p-Values = 61
 4.5 But Is the Effect E Real? = 64
 4.6 Conclusions = 64
  References = 65
5 Supervised Learning = 67
 5.1 Introduction = 67
 5.2 The Problem = 68
 5.3 First Steps = 69
 5.4 What Is Learning? = 78
 5.5 Learning Curves = 79
 5.6 Training, Validation and Test = 82
 5.7 Two Learning Models = 86
  5.7.1 Generalities Concerning Learning Models = 86
  5.7.2 Support Vector Machines = 87
  5.7.3 Random Forest = 90
 5.8 Ending the Learning Process = 91
 5.9 A Toy Business Case = 92
 5.10 Conclusion = 95
  Reference = 96
6 Regression Analysis = 97
 6.1 Introduction = 97
 6.2 Linear Regression = 98
  6.2.1 Simple Linear Regression = 98
  6.2.2 Multiple Linear Regression and Polynomial Regression = 103
  6.2.3 Sparse Model = 104
 6.3 Logistic Regression = 110
 6.4 Conclusions = 113
  References = 114
7 Unsupervised Learning = 115
 7.1 Introduction = 115
 7.2 Clustering = 116
  7.2.1 Similarity and Distances = 117
  7.2.2 What Constitutes a Good Clustering? Defining Metrics to Measure Clustering Quality = 117
  7.2.3 Taxonomies of Clustering Techniques = 120
 7.3 Case Study = 132
 7.4 Conclusions = 138
  References = 139
8 Network Analysis = 141
 8.1 Introduction = 141
 8.2 Basic Definitions in Graphs = 142
 8.3 Social Network Analysis = 144
  8.3.1 Basics in NetworkX = 144
  8.3.2 Practical Case : Facebook Dataset = 145
 8.4 Centrality = 147
  8.4.1 Drawing Centrality in Graphs = 152
  8.4.2 PageRank = 154
 8.5 Ego-Networks = 157
 8.6 Community Detection = 162
 8.7 Conclusions = 163
  References = 164
9 Recommender Systems = 165
 9.1 Introduction = 165
 9.2 How Do Recommender Systems Work? = 166
  9.2.1 Content-Based Filtering = 166
  9.2.2 Collaborative Filtering = 167
  9.2.3 Hybrid Recommenders = 167
 9.3 Modeling User Preferences = 167
 9.4 Evaluating Recommenders = 168
 9.5 Practical Case = 169
  9.5.1 MovieLens Dataset = 169
  9.5.2 User-Based Collaborative Filtering = 171
 9.6 Conclusions = 179
  References = 179
10 Statistical Natural Language Processing for Sentiment Analysis = 181
 10.1 Introduction = 181
 10.2 Data Cleaning = 182
 10.3 Text Representation = 185
  10.3.1 Bi-Grams and n-Grams = 190
 10.4 Practical Cases = 191
 10.5 Conclusions = 196
  References = 196
11 Parallel Computing = 199
 11.1 Introduction = 199
 11.2 Architecture = 200
  11.2.1 Getting Started = 201
  11.2.2 Connecting to the Cluster (The Engines) = 202
 11.3 Multicore Programming = 203
  11.3.1 Direct View of Engines = 203
  11.3.2 Load-Balanced View of Engines = 206
 11.4 Distributed Computing = 207
 11.5 A Real Application : New York Taxi Trips = 208
  11.5.1 A Direct View Non-Blocking Proposal = 209
  11.5.2 Results = 212
 11.6 Conclusions = 214
  References = 215
Index = 217
