AWS AIF-C01: Machine Learning Basics
Machine Learning Fundamentals
Training data
An ML model is only as good as the data used to train it.
Labeled data
A labeled dataset is one in which each instance (example) is accompanied by a label, or target variable, that represents the desired output or classification.
Unlabeled data
An unlabeled dataset is one in which the instances (examples) have no associated labels or target variables. The data consists only of input features, without any corresponding output or classification.
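The difference between the two kinds of dataset can be sketched in a few lines of Python. This is only an illustration: the feature values, the labels, and the helper name `split_features_and_labels` are made up and do not come from the exam material.

```python
# Hypothetical toy data: each example has two input features
# (height in cm, weight in kg). All values are invented for illustration.

# Labeled data: every example pairs its input features with a target label.
labeled_data = [
    ((170.0, 65.0), "adult"),
    ((120.0, 25.0), "child"),
]

# Unlabeled data: input features only, with no target attached.
unlabeled_data = [
    (165.0, 60.0),
    (110.0, 22.0),
]


def split_features_and_labels(dataset):
    """Separate a labeled dataset into input features and target labels."""
    features = [x for x, _ in dataset]
    labels = [y for _, y in dataset]
    return features, labels


features, labels = split_features_and_labels(labeled_data)
```

After the split, `features` has the same shape as `unlabeled_data`; the labels are the only thing the unlabeled set is missing.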
Structured data
Tabular data
Data stored in spreadsheets, databases, or CSV files, with rows representing instances and columns representing features or attributes.
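As a rough sketch of the row/column convention, the snippet below reads a small CSV with Python's standard `csv` module. The column names and values are hypothetical, and treating `species` as the label column is an assumption made for the example.

```python
import csv
import io

# Hypothetical CSV content: each row is one instance, each column one feature.
csv_text = """sepal_length,sepal_width,species
5.1,3.5,setosa
6.2,2.9,versicolor
"""

# DictReader maps every row to a dict keyed by the header names.
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Assumption for this example: "species" is the label, the rest are features.
feature_names = [name for name in rows[0] if name != "species"]
```

Here `rows` holds two instances, and `feature_names` lists the two feature columns.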
Time-series data
Sequences of values measured at successive points in time, such as stock prices, sensor readings, or weather data.
Machine learning process
Machine learning is traditionally divided into three broad categories: supervised learning, unsupervised learning, and reinforcement learning.
Supervised learning
Algorithms are trained on labeled data. The goal is to learn a mapping function that can predict the output for new, unseen input data.
Types of supervised learning
- Classification: assigning labels or categories to new, unseen inputs
- Regression: predicting continuous numerical values based on one or more input variables
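The two supervised task types can be contrasted with a minimal pure-Python sketch. The nearest-neighbor rule and the least-squares line are only two of many possible algorithms, chosen here for brevity. The data and function names are hypothetical.

```python
def nearest_neighbor_classify(train, x):
    """Classification: return the label of the training example closest to x."""
    _, closest_label = min(train, key=lambda pair: abs(pair[0] - x))
    return closest_label


def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical labeled data: temperature in Celsius with a category label.
temperatures = [(5.0, "cold"), (10.0, "cold"), (25.0, "warm"), (30.0, "warm")]
label = nearest_neighbor_classify(temperatures, 27.0)  # output is a category

# Hypothetical labeled data: floor area with a numeric price target.
sizes = [50.0, 60.0, 80.0, 100.0]
prices = [150.0, 180.0, 240.0, 300.0]
slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 70.0 + intercept  # output is a continuous number
```

The classifier returns one of a fixed set of labels, while the regression model returns a number that need not appear anywhere in the training data.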
Unsupervised learning
Algorithms learn from unlabeled data. The goal is to discover inherent patterns, structures, or relationships within the input data.
Types of unsupervised learning
- Clustering: grouping data into different clusters based on similar features or distances
- Dimensionality reduction: reducing the number of features or dimensions in a dataset
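Both ideas can be illustrated without any labels. The sketch below uses a k-means-style loop on one-dimensional values for clustering. For dimensionality reduction it uses the simplest possible stand-in, dropping columns that never change. Practical techniques such as PCA are more involved, and everything here (data, starting centroids, function names) is an assumption made for illustration.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Clustering: group 1-D values around centroids (k-means style)."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            # Assign each value to its nearest centroid.
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its cluster (unchanged if empty).
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return clusters, centroids


def drop_constant_features(rows):
    """Dimensionality reduction (simplest form): drop columns that never vary."""
    keep = [j for j in range(len(rows[0])) if len({row[j] for row in rows}) > 1]
    return [[row[j] for j in keep] for row in rows]


# Hypothetical unlabeled values with two visible groups.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
clusters, centroids = kmeans_1d(values, centroids=[0.0, 5.0])

# Hypothetical rows whose middle feature carries no information.
rows = [[1.0, 7.0, 0.5], [2.0, 7.0, 0.1], [3.0, 7.0, 0.9]]
reduced = drop_constant_features(rows)
```

No labels are supplied at any point: the groups and the redundant column are discovered from the input values alone.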
Reinforcement learning
The machine is given only a performance score as guidance. Feedback is provided in the form of rewards or penalties for its actions, and the machine learns from this feedback to improve its decision-making over time.
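The reward-and-penalty loop can be sketched with a tiny two-action example. This uses an epsilon-greedy strategy, which is one common approach and not something the notes prescribe. The reward values, the parameters, and the function name `run_bandit` are all hypothetical.

```python
import random


def run_bandit(rewards, episodes=200, epsilon=0.1, seed=0):
    """Learn per-action value estimates from reward feedback (epsilon-greedy)."""
    rng = random.Random(seed)
    values = [0.0] * len(rewards)  # estimated reward for each action
    counts = [0] * len(rewards)    # how often each action has been tried
    for _ in range(episodes):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(len(rewards))
        else:
            # Exploit: pick the action with the best estimate so far.
            action = max(range(len(rewards)), key=lambda a: values[a])
        reward = rewards[action]  # feedback: a reward (+) or a penalty (-)
        counts[action] += 1
        # Update the running mean of rewards observed for this action.
        values[action] += (reward - values[action]) / counts[action]
    return values


# Hypothetical environment: action 0 is penalized, action 1 is rewarded.
values = run_bandit(rewards=[-1.0, 1.0])
best_action = max(range(len(values)), key=lambda a: values[a])
```

The agent is never told which action is correct. It only sees the score that follows each choice, and its estimates shift toward the rewarded action over time.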
Inferencing
After a model has been trained, it can use what it has learned to make predictions or decisions. This process is called inferencing.
Batch inferencing
The computer takes a large amount of data and analyzes it all at once to provide a set of results.
Real-time inferencing
The computer makes decisions quickly in response to new incoming information, such as in chatbots or self-driving cars.
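The two inferencing modes differ mainly in how inputs reach the model, which a short sketch can make concrete. The `predict` function below is a hypothetical stand-in for a trained model (a fixed threshold rule), not a real model or AWS API.

```python
def predict(temperature_c):
    """Hypothetical stand-in for a trained model: a fixed threshold rule."""
    return "warm" if temperature_c >= 20.0 else "cold"


def batch_inference(inputs):
    """Batch: score a whole collection of inputs in one pass."""
    return [predict(x) for x in inputs]


def realtime_inference(stream):
    """Real-time: produce one prediction as soon as each input arrives."""
    for x in stream:
        yield predict(x)


# Batch: all inputs are available up front and are processed together.
batch_results = batch_inference([5.0, 22.0, 31.0])

# Real-time: inputs arrive one at a time; each gets an immediate answer.
live_predictions = realtime_inference(iter([18.0, 27.0]))
first_live_result = next(live_predictions)
```

In the batch case the caller waits for the full list of results, while in the real-time case each prediction is available as soon as its input has been seen.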