Using ChatGPT + Streamlit to Quickly Build a Machine Learning Dataset Splitting App

Posted 2 years ago · Author: AI算法蔣同學 · Category: Toy Blog

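Is ChatGPT useful for Python programmers, especially those of us who use Python for data processing, data cleaning, and building machine learning models? Let's try it and find out.

What is ChatGPT, from OpenAI? I'll assume you already know; with all the hype online, it hardly needs an introduction. If you happen not to know what ChatGPT is, go look it up quickly (before anyone teases you), then come back and continue.

Conversations with ChatGPT can be entertaining, even enlightening. But let's see how helpful it actually is.

Let's see what ChatGPT can do when writing code from specifications that we provide. We'll start simple and build from there.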

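I. A simple dataset processing script

I think it's best to start with some kind of data processing script. Since I plan to have ChatGPT write some machine learning scripts with various Python libraries later on, this seems like a reasonable starting point.

1. Goal

First, I want to load a dataset; make sure all of its values are numeric, or convert them to numeric; check for missing values; and split the dataset into training and test sets.

2. User prompt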

Write a Python script that does the following:
- loads a dataset from csv on disk to Pandas DataFrame
- ensures that dataset is entirely numeric; converts string values to numeric and tracks this conversion in a dictionary to reference later
- checks for missing values in the DataFrame and imputes missing values for any record that has one missing value; entirely drops a record that has more than one missing value
- splits the dataset into test and training sets

3. Code returned by ChatGPT

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

# Load the dataset from csv to a Pandas DataFrame
df = pd.read_csv("dataset.csv")

# Ensure the dataset is entirely numeric
# Convert string values to numeric and track this conversion in a dictionary to reference later
string_columns = df.columns[df.dtypes == 'object']
string_to_numeric = {}
for column in string_columns:
    unique_values = df[column].unique()
    mapping = {value: i for i, value in enumerate(unique_values)}
    string_to_numeric[column] = mapping
    df[column] = df[column].map(mapping)

# Check for missing values in the DataFrame
missing_values = df.isna().sum()

# Impute missing values for any record that has one missing value; entirely drop a record that has more than one missing value
for column, missing_value_count in missing_values.items():
    if missing_value_count == 1:
        df[column].fillna(df[column].mean(), inplace=True)
    elif missing_value_count > 1:
        df.dropna(subset=[column], inplace=True)

# Split the dataset into test and training sets
train_data, test_data, train_target, test_target = train_test_split(df.drop('target', axis=1), df['target'], test_size=0.2, random_state=0)

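Note: This script assumes that the target variable is stored in a column named target in the DataFrame. You'll need to modify this code based on the structure of your dataset.

4. Analysis of the results

ChatGPT produced a useful script. It decided on a dataset file name (dataset.csv) and a train/test split ratio on its own, and its note explains what to change to fit your own data. Overall, the task is done.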
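One detail is worth checking against the prompt: it asked for missing values to be handled per record (row), while the returned script counts missing values per column. A minimal sketch of a row-wise variant follows. The function name `handle_missing_rows`, the `max_missing` parameter, and the use of column means for imputation are assumptions made for illustration; they are not part of ChatGPT's answer.

```python
import numpy as np
import pandas as pd


def handle_missing_rows(df: pd.DataFrame, max_missing: int = 1) -> pd.DataFrame:
    """Drop rows with more than `max_missing` NaNs, then mean-impute the rest.

    Hypothetical helper: assumes every column is already numeric.
    """
    # Count missing values per row (axis=1), not per column
    missing_per_row = df.isna().sum(axis=1)
    # Keep only the rows at or below the threshold
    kept = df.loc[missing_per_row <= max_missing].copy()
    # Fill the remaining gaps with each column's mean
    return kept.fillna(kept.mean())


# Toy data: row 1 has one missing value, row 3 has two
demo = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, np.nan],
    "b": [10.0, 20.0, 30.0, np.nan],
})
cleaned = handle_missing_rows(demo)
print(cleaned)
```

In this toy example the row with two missing values is dropped, and the single gap in column `a` is filled with the mean of the remaining values in that column.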

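II. Refactoring our code

1. Goal

Now that we have a script that does something, let's refactor it a bit. A cornerstone of good coding is using functions to organize code logically, promote reuse, and make sure you don't repeat yourself.

2. User prompt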

Rewrite the above script to use a series of functions that make sense

3. Code returned by ChatGPT

Here’s the script refactored into a series of functions that make sense:

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

def load_dataset(filepath):
    """Loads a dataset from a csv file to a Pandas DataFrame"""
    return pd.read_csv(filepath)

def convert_to_numeric(df):
    """Ensures the dataset is entirely numeric and converts string values to numeric"""
    string_columns = df.columns[df.dtypes == 'object']
    string_to_numeric = {}
    for column in string_columns:
        unique_values = df[column].unique()
        mapping = {value: i for i, value in enumerate(unique_values)}
        string_to_numeric[column] = mapping
        df[column] = df[column].map(mapping)
    return df, string_to_numeric

def handle_missing_values(df):
    """Handles missing values by imputing a mean for records with one missing value, and dropping records with more than one missing value"""
    missing_values = df.isna().sum()
    for column, missing_value_count in missing_values.items():
        if missing_value_count == 1:
            df[column].fillna(df[column].mean(), inplace=True)
        elif missing_value_count > 1:
            df.dropna(subset=[column], inplace=True)
    return df

def split_dataset(df, target_column, test_size=0.2, random_state=0):
    """Splits the dataset into test and training sets"""
    train_data, test_data, train_target, test_target = train_test_split(df.drop(target_column, axis=1), df[target_column], test_size=test_size, random_state=random_state)
    return train_data, test_data, train_target, test_target

# Load the dataset from csv to a Pandas DataFrame
df = load_dataset("dataset.csv")

# Convert string values to numeric and handle missing values
df, string_to_numeric = convert_to_numeric(df)
df = handle_missing_values(df)

# Split the dataset into test and training sets
train_data, test_data, train_target, test_target = split_dataset(df, "target")

Note: This script assumes that the target variable is stored in a column named target in the DataFrame. You’ll need to modify this code based on the structure of your dataset.

Here, ChatGPT used four functions:

1. load_dataset(filepath): loads the csv dataset.

2. convert_to_numeric(df): converts string values to numeric values.

3. handle_missing_values(df): handles missing values in the dataset.

4. split_dataset(df, target_column, test_size=0.2, random_state=0): splits the dataset into test and training sets.

The main script uses these functions to load, convert, and clean the dataset, and then split it into test and training sets.

The dataset path and the test size are inputs to these functions.
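The dictionary returned by convert_to_numeric is what makes it possible to map the integer codes back to the original strings later, as the first prompt requested. A small sketch of that lookup is shown here on a made-up `color` column; the helper name `decode_column` is hypothetical and does not appear in ChatGPT's code.

```python
import pandas as pd


def decode_column(series: pd.Series, mapping: dict) -> pd.Series:
    """Map integer codes back to the original strings (hypothetical helper)."""
    # Invert {string: code} into {code: string}
    inverse = {code: value for value, code in mapping.items()}
    return series.map(inverse)


df = pd.DataFrame({"color": ["red", "blue", "red"]})
# Same encoding scheme as convert_to_numeric: enumerate the unique values
mapping = {value: i for i, value in enumerate(df["color"].unique())}
encoded = df["color"].map(mapping)
decoded = decode_column(encoded, mapping)
print(mapping)
print(encoded.tolist())
print(decoded.tolist())
```

Because the codes are assigned in order of first appearance, the mapping depends on row order, so it should be saved alongside the processed data rather than recomputed.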

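4. Analysis of the results

Looking at ChatGPT's first script, there were some fairly obvious logical blocks of code that could be wrapped into functions by purpose. ChatGPT did what I expected it to do. Again, it generated a competent script.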
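To see what the split step returns, here is a short sketch that makes the same train_test_split call as split_dataset on a toy DataFrame. The column names `x1`, `x2`, and `target` and the data values are made up for illustration.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy data: 10 rows, two feature columns and a "target" column (all made up)
df = pd.DataFrame({
    "x1": range(10),
    "x2": range(10, 20),
    "target": [0, 1] * 5,
})

# Features and target are passed separately; the return order is
# features-train, features-test, target-train, target-test
train_data, test_data, train_target, test_target = train_test_split(
    df.drop("target", axis=1), df["target"], test_size=0.2, random_state=0
)
print(train_data.shape, test_data.shape)
print(len(train_target), len(test_target))
```

With 10 rows and test_size=0.2, 8 rows go to the training set and 2 to the test set; fixing random_state makes the split repeatable.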

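III. Extending our code

1. Goal

Our script looks a little better now that it uses functions. Next, let's extend it slightly so that the user can specify a few parameters when running it.

2. User prompt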

Rewrite the above program to solicit the following from the command line in the form of arguments, and use them in the program:
- the path/name of the dataset csv file
- the train/test split size
- the threshold for the number of missing values per record to drop after which to drop entire record, instead of imputing the missing values

3. Code returned by ChatGPT

Here’s the script refactored to solicit the arguments from the command line:

import pandas as pd
import numpy as np
import argparse
from sklearn.model_selection import train_test_split

def load_dataset(filepath):
    """Loads a dataset from a csv file to a Pandas DataFrame"""
    return pd.read_csv(filepath)

def convert_to_numeric(df):
    """Ensures the dataset is entirely numeric and converts string values to numeric"""
    string_columns = df.columns[df.dtypes == 'object']
    string_to_numeric = {}
    for column in string_columns:
        unique_values = df[column].unique()
        mapping = {value: i for i, value in enumerate(unique_values)}
        string_to_numeric[column] = mapping
        df[column] = df[column].map(mapping)
    return df, string_to_numeric

def handle_missing_values(df, missing_value_threshold):
    """Handles missing values by imputing a mean for records with one missing value, and dropping records with more than one missing value"""
    missing_values = df.isna().sum()
    for column, missing_value_count in missing_values.items():
        if missing_value_count <= missing_value_threshold:
            df[column].fillna(df[column].mean(), inplace=True)
        elif missing_value_count > missing_value_threshold:
            df.dropna(subset=[column], inplace=True)
    return df

def split_dataset(df, target_column, test_size=0.2, random_state=0):
    """Splits the dataset into test and training sets"""
    train_data, test_data, train_target, test_target = train_test_split(df.drop(target_column, axis=1), df[target_column], test_size=test_size, random_state=random_state)
    return train_data, test_data, train_target, test_target

if __name__ == '__main__':
    # Define the command line arguments
    parser = argparse.ArgumentParser(description='Handles missing values and splits a dataset into test and training sets')
    parser.add_argument('filepath', type=str, help='The path/name of the dataset csv file')
    parser.add_argument('test_size', type=float, help='The test/train split size')
    parser.add_argument('missing_value_threshold', type=int, help='The number of missing values per record after which to drop the entire record, instead of imputing the missing values')
    args = parser.parse_args()

    # Load the dataset from csv to a Pandas DataFrame
    df = load_dataset(args.filepath)

    # Convert string values to numeric and handle missing values
    df, string_to_numeric = convert_to_numeric(df)
    df = handle_missing_values(df, args.missing_value_threshold)

    # Split the dataset into test and training sets
    train_data, test_data, train_target, test_target = split_dataset(df, df.columns[-1], test_size=args.test_size)

This script can be run from the command line with the following command format:

python script_name.py filepath test_size missing_value_threshold

4. Analysis of the results

Honestly, not much analysis is needed here, since ChatGPT's comments already explain what the code does quite well. It even told us how to run it!
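For readers less familiar with argparse, the sketch below shows how the three positional arguments are parsed and type-converted. It passes an explicit list to parse_args instead of reading the real command line, so it can be run anywhere; the `build_parser` wrapper and the example values are assumptions for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the same three positional arguments as the script."""
    parser = argparse.ArgumentParser(
        description="Handles missing values and splits a dataset"
    )
    parser.add_argument("filepath", type=str, help="Path to the dataset csv file")
    parser.add_argument("test_size", type=float, help="The test/train split size")
    parser.add_argument(
        "missing_value_threshold",
        type=int,
        help="Missing-value count above which data is dropped instead of imputed",
    )
    return parser


# Equivalent to: python script_name.py dataset.csv 0.2 1
args = build_parser().parse_args(["dataset.csv", "0.2", "1"])
print(args.filepath, args.test_size, args.missing_value_threshold)
```

Positional arguments are matched by order, and the `type=` callables turn the strings "0.2" and "1" into a float and an int before the rest of the program sees them.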

IV. Converting our code to a Streamlit app

1. Goal

Now, let's suppose we want to make this code usable in an application. Let's have ChatGPT wrap these functions in a Streamlit app.

2. User prompt

Rewrite the above program as a Streamlit app, allowing for the user to provide the same arguments as in the command line args above

3. Code returned by ChatGPT

Here’s the script refactored as a Streamlit app:

import streamlit as st
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

def load_dataset(filepath):
    """Loads a dataset from a csv file to a Pandas DataFrame"""
    return pd.read_csv(filepath)

def convert_to_numeric(df):
    """Ensures the dataset is entirely numeric and converts string values to numeric"""
    string_columns = df.columns[df.dtypes == 'object']
    string_to_numeric = {}
    for column in string_columns:
        unique_values = df[column].unique()
        mapping = {value: i for i, value in enumerate(unique_values)}
        string_to_numeric[column] = mapping
        df[column] = df[column].map(mapping)
    return df, string_to_numeric

def handle_missing_values(df, missing_value_threshold):
    """Handles missing values by imputing a mean for records with one missing value, and dropping records with more than one missing value"""
    missing_values = df.isna().sum()
    for column, missing_value_count in missing_values.items():
        if missing_value_count <= missing_value_threshold:
            df[column].fillna(df[column].mean(), inplace=True)
        elif missing_value_count > missing_value_threshold:
            df.dropna(subset=[column], inplace=True)
    return df

def split_dataset(df, target_column, test_size=0.2, random_state=0):
    """Splits the dataset into test and training sets"""
    train_data, test_data, train_target, test_target = train_test_split(df.drop(target_column, axis=1), df[target_column], test_size=test_size, random_state=random_state)
    return train_data, test_data, train_target, test_target

def main():
    """Main function for the Streamlit app"""
    st.title("Dataset Preprocessing App")

    # Get the file path, test size, and missing value threshold from the user
    file_path = st.text_input("Enter the path/name of the dataset csv file:")
    test_size = st.slider("Enter the test/train split size:", 0.0, 1.0, 0.2)
    missing_value_threshold = st.slider("Enter the number of missing values per record after which to drop the entire record, instead of imputing the missing values:", 0, 100, 1)

    # Load the dataset and convert string values to numeric
    df = load_dataset(file_path)
    df, string_to_numeric = convert_to_numeric(df)

    # Handle missing values
    df = handle_missing_values(df, missing_value_threshold)

    # Split the dataset into test and training sets
    train_data, test_data, train_target, test_target = split_dataset(df, df.columns[-1], test_size=test_size)

    # Show the final processed data
    st.write("Processed Dataset:")
    st.write(df)

    st.write("Train Data:")
    st.write(train_data)
    st.write("Train Target:")
    st.write(train_target)

    st.write("Test Data:")
    st.write(test_data)
    st.write("Test Target:")
    st.write(test_target)

if __name__ == '__main__':
    main()

To run the Streamlit app, you can save the above code in a .py file and run it in the command line with the following command:

streamlit run file_name.py

4. Analysis of the results

It did what it was supposed to do. ChatGPT took our functions and built a simple Streamlit web app around them.
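One practical caveat: the app calls load_dataset as soon as the script runs, while the text box is still empty, and the slider allows test sizes of exactly 0.0 and 1.0, which train_test_split is expected to reject when given as floats. A minimal sketch of an input check that could run before loading is given below. The function `validate_inputs` and its messages are hypothetical, and the Streamlit wiring appears only in comments so the example runs without Streamlit installed.

```python
import os


def validate_inputs(file_path: str, test_size: float) -> list:
    """Return a list of problems; an empty list means the inputs look usable."""
    problems = []
    if not file_path:
        # The text box is empty on the first run of the app
        problems.append("Please enter a CSV path.")
    elif not os.path.isfile(file_path):
        problems.append(f"File not found: {file_path}")
    if not 0.0 < test_size < 1.0:
        problems.append("Test size must be strictly between 0 and 1.")
    return problems


# In the Streamlit app this could run right after the widgets, for example:
#     problems = validate_inputs(file_path, test_size)
#     for message in problems:
#         st.error(message)
#     if problems:
#         st.stop()
print(validate_inputs("", 0.2))
print(validate_inputs("", 1.0))
```

Keeping the check in a plain function means it can be tested without launching the app.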

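V. Summary

With the four steps above complete, let's check whether the code actually runs. This takes a few steps:

1. Install the streamlit library

pip install streamlit

2. Install the scikit-learn library

pip install scikit-learn

3. Happily run it

streamlit run main.py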

The code in VS Code: [screenshot not preserved]

The running web app: [screenshot not preserved]

Done! Top marks, ChatGPT!
