
Comprehensive Data Analysis in Python: RFM User Segmentation Model

2 years ago · Author: you_are_my_sunshine* · Category: Toy Blog · Views (30)

This article walks through a comprehensive data analysis case in Python that builds an RFM user segmentation model. Corrections are welcome if you spot errors or anything that was not fully considered.

1. Loading the data

import pandas as pd
dataset = pd.read_csv('SupplyChain.csv', encoding='unicode_escape')
dataset


2. Inspecting the data

print(dataset.shape)
print(dataset.isnull().sum())


3. Merging and filling data

print(dataset[['Customer Fname', 'Customer Lname']])
# Merge first name and last name into one field, separated by a space
dataset['Customer Full Name'] = dataset['Customer Fname'] + ' ' + dataset['Customer Lname']
#dataset.head()
dataset['Customer Zipcode'].value_counts()
# Count missing values; there are 3
print(dataset['Customer Zipcode'].isnull().sum())


dataset['Customer Zipcode'] = dataset['Customer Zipcode'].fillna(0)
dataset.head()


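4. Correlation between feature fields

import matplotlib.pyplot as plt
import seaborn as sns
# Heatmap of correlations between feature fields
data = dataset
plt.figure(figsize=(20,10))
# annot=True prints the coefficient in each cell;
# numeric_only=True skips text columns, which pandas >= 2.0 would otherwise reject
sns.heatmap(data.corr(numeric_only=True), annot=True, cmap='coolwarm')
# Conclusion: Product Price is highly correlated with Sales and Order Item Total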
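
A conclusion read off a heatmap can also be checked numerically by pulling a single coefficient out of the correlation matrix. The following is a minimal sketch on a small made-up frame: the values and the `Customer City` column contents are hypothetical, and only the column names mirror the dataset.

```python
import pandas as pd

# Hypothetical toy data; only the column names mirror the dataset.
toy = pd.DataFrame({
    'Product Price': [10.0, 20.0, 30.0, 40.0],
    'Sales': [11.0, 19.0, 32.0, 41.0],
    'Customer City': ['A', 'B', 'C', 'D'],  # non-numeric column
})

# Keep numeric columns only, then compute pairwise Pearson correlations.
corr = toy.select_dtypes(include='number').corr()

# Look up one pair by row and column label.
price_sales = corr.loc['Product Price', 'Sales']
print(round(price_sales, 3))
```

Selecting numeric columns first is one way to keep text columns out of the matrix regardless of the pandas version in use.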


5. Aggregation

# Group by Market
market = data.groupby('Market')
# Group by Order Region
region = data.groupby('Order Region')
plt.figure(1)
market['Sales per customer'].sum().sort_values(ascending=False).plot.bar(figsize=(12,6), title='Sales in different markets')
plt.figure(2)
region['Sales per customer'].sum().sort_values(ascending=False).plot.bar(figsize=(12,6), title='Sales in different regions')
plt.show()


# Group by Category Name
cat = data.groupby('Category Name')
plt.figure(1)
# Total sales per category
cat['Sales per customer'].sum().sort_values(ascending=False).plot.bar(figsize=(12,6), title='Total sales')
plt.figure(2)
# Average sales per category
cat['Sales per customer'].mean().sort_values(ascending=False).plot.bar(figsize=(12,6), title='Average sales')
plt.show()


6. Sales over time

#data['order date (DateOrders)']
# Build a DatetimeIndex from the order date column
temp = pd.DatetimeIndex(data['order date (DateOrders)'])
temp


# Extract year, month, weekday, hour and month_year from order date (DateOrders)
data['order_year'] = temp.year
data['order_month'] = temp.month
data['order_week_day'] = temp.weekday
data['order_hour'] = temp.hour
data['order_month_year'] = temp.to_period('M')
data.head()
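
To make the extracted fields concrete, here is a small sketch of what each `DatetimeIndex` attribute returns. The two timestamps are made up for illustration, and the month/day/year text format is an assumption about how the column is written.

```python
import pandas as pd

# Two hypothetical order timestamps in month/day/year form.
idx = pd.DatetimeIndex(['1/31/2018 22:56', '1/13/2018 12:27'])

years = list(idx.year)                 # calendar year
months = list(idx.month)               # 1 = January ... 12 = December
weekdays = list(idx.weekday)           # 0 = Monday ... 6 = Sunday
hours = list(idx.hour)                 # 0 to 23
periods = [str(p) for p in idx.to_period('M')]  # year-month periods

print(years, months, weekdays, hours, periods)
```

Note that `weekday` counts from Monday as 0, so axis labels on a plot by weekday should be read accordingly.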


# Explore average sales along different time dimensions: year, day of week, hour, month
plt.figure(figsize=(10, 12))
# A 2x2 grid gives one panel per time dimension
plt.subplot(2, 2, 1)
df_year = data.groupby('order_year')
df_year['Sales'].mean().plot(figsize=(12, 12), title='Average sales in years')
plt.subplot(2, 2, 2)
df_day = data.groupby('order_week_day')
df_day['Sales'].mean().plot(figsize=(12, 12), title='Average sales in days per week')
plt.subplot(2, 2, 3)
df_hour = data.groupby('order_hour')
df_hour['Sales'].mean().plot(figsize=(12, 12), title='Average sales in hours per day')
plt.subplot(2, 2, 4)
df_month = data.groupby('order_month')
df_month['Sales'].mean().plot(figsize=(12, 12), title='Average sales in month per year')
plt.tight_layout()
plt.show()


# Explore the relationship between product price and sales per customer
data.plot(x='Product Price', y='Sales per customer')
plt.title('Relationship between Product Price and Sales per customer')
plt.xlabel('Product Price')
plt.ylabel('Sales per customer')
plt.show()


7. Computing RFM per user

# User segmentation with RFM
data['TotalPrice'] = data['Order Item Quantity'] * data['Order Item Total']
data[['TotalPrice', 'Order Item Quantity', 'Order Item Total']]


# Convert the order date column to datetime
data['order date (DateOrders)'] = pd.to_datetime(data['order date (DateOrders)'])
# Find the date of the most recent order
data['order date (DateOrders)'].max()


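# Assume the current date is 2018-2-1
import datetime
present = datetime.datetime(2018,2,1)
# Compute each user's RFM metrics by grouping on Order Customer Id:
#   R: days since the user's most recent order
#   F: number of order rows for the user
#   M: total amount spent by the user
customer_seg = data.groupby('Order Customer Id').agg({
    'order date (DateOrders)': lambda x: (present - x.max()).days,
    'Order Id': lambda x: len(x),
    'TotalPrice': lambda x: x.sum(),
})
customer_seg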
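
The aggregation is easier to follow on a tiny example. The sketch below uses three made-up order rows for two customers (all values hypothetical) and pandas named aggregation, which is an alternative to the dict form that names the output columns directly.

```python
import datetime
import pandas as pd

# Hypothetical order rows for two customers.
orders = pd.DataFrame({
    'Order Customer Id': [1, 1, 2],
    'Order Id': [100, 101, 200],
    'order date (DateOrders)': pd.to_datetime(
        ['2018-01-01', '2018-01-22', '2017-12-03']),
    'TotalPrice': [50.0, 30.0, 120.0],
})
present = datetime.datetime(2018, 2, 1)

rfm = orders.groupby('Order Customer Id').agg(
    # Recency: days between the reference date and the latest order
    R_Value=('order date (DateOrders)', lambda x: (present - x.max()).days),
    # Frequency: number of order rows
    F_Value=('Order Id', 'count'),
    # Monetary: total amount spent
    M_Value=('TotalPrice', 'sum'),
)
print(rfm)
```

Customer 1 ordered most recently on 2018-01-22, so recency is 10 days; customer 2 last ordered on 2017-12-03, which is 60 days before the reference date.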


# Rename the columns to R, F, M
customer_seg.rename(columns={'order date (DateOrders)': 'R_Value', 'Order Id': 'F_Value', 'TotalPrice': 'M_Value'}, inplace=True)
customer_seg.head()


# Split the RFM data into 4 bands using quartiles
quantiles = customer_seg.quantile(q=[0.25, 0.5, 0.75])
quantiles = quantiles.to_dict()
quantiles
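
The scoring functions that follow index into this dictionary as `c[column][quantile]`, so it helps to see its shape. A minimal sketch with made-up recency values:

```python
import pandas as pd

# Hypothetical recency values for five customers.
rfm = pd.DataFrame({'R_Value': [10, 20, 30, 40, 50]}, index=[1, 2, 3, 4, 5])

# Outer key: column name. Inner key: the quantile level as a float.
q = rfm.quantile(q=[0.25, 0.5, 0.75]).to_dict()
print(q)

lower_quartile = q['R_Value'][0.25]
median = q['R_Value'][0.5]
upper_quartile = q['R_Value'][0.75]
```

Because the inner keys are floats, `q['R_Value'][0.50]` and `q['R_Value'][0.5]` refer to the same entry.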


# Smaller R_Value is better => higher R_Score
# a: the value, b: the column name, c: the quantile dictionary
def R_Score(a, b, c):
    if a <= c[b][0.25]:
        return 4
    elif a <= c[b][0.50]:
        return 3
    elif a <= c[b][0.75]:
        return 2
    else:
        return 1

# Larger F_Value and M_Value are better => higher score
def FM_Score(a, b, c):
    if a <= c[b][0.25]:
        return 1
    elif a <= c[b][0.50]:
        return 2
    elif a <= c[b][0.75]:
        return 3
    else:
        return 4

# New R_Score column mapping R_Value => [1,4]
customer_seg['R_Score'] = customer_seg['R_Value'].apply(R_Score, args=("R_Value", quantiles))
# New F_Score column mapping F_Value => [1,4]
customer_seg['F_Score'] = customer_seg['F_Value'].apply(FM_Score, args=("F_Value", quantiles))
# New M_Score column mapping M_Value => [1,4]
customer_seg['M_Score'] = customer_seg['M_Value'].apply(FM_Score, args=("M_Value", quantiles))
customer_seg.head()
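
The same quartile banding can be sketched without a per-row function by using `pd.cut` with the quartiles as bin edges. This is an alternative formulation, not the article's method, and it assumes the three quartile values are distinct (`pd.cut` raises an error on duplicate edges unless `duplicates='drop'` is passed). The input values are made up.

```python
import numpy as np
import pandas as pd

# Hypothetical metric values for five customers.
values = pd.Series([5, 12, 30, 45, 80])
q = values.quantile([0.25, 0.5, 0.75])

# Right-closed bins: (-inf, q25], (q25, q50], (q50, q75], (q75, inf)
bins = [-np.inf, q.loc[0.25], q.loc[0.5], q.loc[0.75], np.inf]

# Recency style: smaller value gets the higher score.
r_score = pd.cut(values, bins=bins, labels=[4, 3, 2, 1]).astype(int)
# Frequency/monetary style: larger value gets the higher score.
fm_score = pd.cut(values, bins=bins, labels=[1, 2, 3, 4]).astype(int)

print(list(r_score), list(fm_score))
```

The bins are closed on the right, which matches the `<=` comparisons in the functions above: a value equal to a quartile falls into the lower band.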


# Assign each user to an RFM segment
# A score above 2 counts as "high" on that dimension
def RFM_User(df):
    if df['M_Score'] > 2 and df['F_Score'] > 2 and df['R_Score'] > 2:
        return 'Important value user'
    if df['M_Score'] > 2 and df['F_Score'] <= 2 and df['R_Score'] > 2:
        return 'Important development user'
    if df['M_Score'] > 2 and df['F_Score'] > 2 and df['R_Score'] <= 2:
        return 'Important retention user'
    if df['M_Score'] > 2 and df['F_Score'] <= 2 and df['R_Score'] <= 2:
        return 'Important win-back user'

    if df['M_Score'] <= 2 and df['F_Score'] > 2 and df['R_Score'] > 2:
        return 'General value user'
    if df['M_Score'] <= 2 and df['F_Score'] <= 2 and df['R_Score'] > 2:
        return 'General development user'
    if df['M_Score'] <= 2 and df['F_Score'] > 2 and df['R_Score'] <= 2:
        return 'General retention user'
    if df['M_Score'] <= 2 and df['F_Score'] <= 2 and df['R_Score'] <= 2:
        return 'General win-back user'

customer_seg['Customer_Segmentation'] = customer_seg.apply(RFM_User, axis=1)
customer_seg
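
The eight branches follow a regular pattern: the M score decides between the "important" and "general" tiers, and the (F, R) pair decides the kind of user. A compact sketch of that structure is below. The function name `rfm_label` and the English label wording are illustrative choices, not taken from the original code.

```python
def rfm_label(r_score, f_score, m_score):
    """Return a segment label from three scores in the range 1 to 4."""
    # Monetary score picks the tier.
    tier = 'Important' if m_score > 2 else 'General'
    # (frequency high?, recency high?) picks the kind of user.
    kinds = {
        (True, True): 'value',
        (False, True): 'development',
        (True, False): 'retention',
        (False, False): 'win-back',
    }
    kind = kinds[(f_score > 2, r_score > 2)]
    return f'{tier} {kind} user'


for scores in [(4, 4, 4), (3, 1, 4), (2, 3, 2), (1, 1, 1)]:
    print(scores, rfm_label(*scores))
```

Written this way, every combination of scores maps to exactly one label, so no row can fall through without a segment.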


8. Saving the data

(1).to_csv

# Keep the index when writing: it holds Order Customer Id
customer_seg.to_csv('supply_chain_rfm_result.csv')

(2).to_pickle

# Save the preprocessed data for later reuse
data.to_pickle('data.pkl')
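
Reading the files back is the mirror image of writing them. The sketch below round-trips a small made-up frame through both formats in a temporary directory; the file names and values are hypothetical.

```python
import os
import tempfile
import pandas as pd

# Hypothetical segmentation result indexed by customer id.
seg = pd.DataFrame(
    {'R_Value': [10, 60], 'F_Value': [2, 1]},
    index=pd.Index([1, 2], name='Order Customer Id'),
)

with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, 'rfm.csv')
    pkl_path = os.path.join(tmp, 'rfm.pkl')

    seg.to_csv(csv_path)      # the index is written as the first column
    seg.to_pickle(pkl_path)   # preserves dtypes and the index as-is

    # index_col restores the customer id as the index.
    from_csv = pd.read_csv(csv_path, index_col='Order Customer Id')
    from_pkl = pd.read_pickle(pkl_path)

print(from_csv)
print(from_pkl.equals(seg))
```

CSV is portable and human-readable, while pickle keeps pandas dtypes intact but should only be loaded from sources you trust.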

Reference: Kaikeba
