
2020 Telecom Industry Interim Report: The 5G Build-Out Peak and Downstream Development Trends


ZIP file · 2.14 MB · updated 2025-08-08
The title "通信行业2020中报总结" (2020 telecom industry mid-year report summary) indicates that the document is a half-year performance or market analysis of the telecom industry for 2020. The description repeats the title, re-emphasizing "Q2 5G construction peak; future focus on the downstream". Although no tags are provided, several key points can be distilled from the title and description:

1. Telecom industry: the industry concerned with transmitting information, including but not limited to mobile communications, data communications, satellite communications, and Internet service providers. It is characterized by rapid technology turnover and broad social impact.

2. 2020 mid-year report: in the middle of each fiscal year, companies and industries publish a report summarizing first-half operations. The interim report is a key document for investors, analysts, and other stakeholders assessing industry or company performance.

3. Q2 5G construction peak: the second quarter of 2020 marked the peak of 5G (fifth-generation mobile communications) build-out. 5G offers high data rates, low latency, and massive connectivity, and represents a major technological step for the industry; during this period the sector invested heavily in 5G network construction and deployment.

4. Future focus on the downstream: in the telecom value chain, "downstream" usually refers to service providers, content providers, and end users. A downstream focus means the industry's strategy will shift toward user experience, service innovation, and the roll-out of 5G applications, including smart manufacturing, telemedicine, smart cities, and autonomous driving.

Without tag information a finer classification is not possible, but from the title and description we can infer that the report analyzes the industry's overall performance in the first half of 2020, the progress of 5G construction in particular, and future trends and opportunities. The ".zip" extension indicates a compressed archive, and "通信行业2020中报总结:Q25G建设高峰,未来重点关注下游.pdf" is likely the file contained in it.

The report probably covers:

- An overview of the industry's H1 2020 performance, including revenue, profit, market share, and other key financial indicators.
- Data and analysis on 5G deployment and investment: number of base stations built, coverage areas, and investment scale.
- Changes 5G brings to the industry, such as business-model innovation and the convergence of new technologies.
- The state of, and outlook for, the downstream market: shifts in end-user demand and trends in 5G-related products and services.
- Challenges and risks: technical hurdles, market risk, and policy changes affecting the industry.
- Strategy recommendations, particularly on driving the adoption and implementation of 5G.

Through its data and analysis, the report supports decision-making by industry insiders, investors, and policymakers, and offers outside observers a window onto the industry's current state and direction.

Related recommendations


```sql
SET hive.strict.checks.cartesian.product=false;
SET hive.mapred.mode=nonstrict;

-- Then run the original SQL
WITH user_daily_status AS (
    -- Per-user daily status (a day counts as 1 if any record that day has status = 1)
    SELECT
        username,
        p_date,
        CASE WHEN MAX(`status`) = 1 THEN 1 ELSE 0 END AS daily_status
    FROM ks_data_factory.sql_completion_perf_user_info_with_generate
    WHERE p_date >= '{{ ds_nodash - 30 }}' AND p_date <= '{{ ds_nodash }}'
    GROUP BY username, p_date
),
user_status_ratio AS (
    -- Share of the last 30 days on which each user had the switch enabled
    SELECT
        username,
        COALESCE(SUM(daily_status) * 1.0 / COUNT(*), 0) AS status_ratio
    FROM user_daily_status
    GROUP BY username
),
user_aggregated AS (
    -- Aggregate the non-string metrics per user
    SELECT
        username, std_1st_dep, std_2nd_dep, std_3rd_dep, channel,
        COALESCE(SUM(sum_request_cnt), 0) AS sum_request_cnt1,
        COALESCE(SUM(sum_apply_cnt), 0) AS sum_apply_cnt1,
        COALESCE(SUM(sum_exposure_cnt), 0) AS sum_exposure_cnt1,
        CASE WHEN SUM(COALESCE(sum_exposure_cnt, 0)) = 0 THEN 0
             ELSE SUM(COALESCE(sum_apply_cnt, 0)) / SUM(COALESCE(sum_exposure_cnt, 0))
        END AS sum_apply_rate1,
        COALESCE(SUM(sum_apply_line_cnt), 0) AS sum_apply_line_cnt1,
        COALESCE(SUM(sum_generate_nows), 0) AS sum_generate_nows1,
        COALESCE(SUM(sum_new_rows), 0) AS sum_new_rows1,
        CASE WHEN SUM(COALESCE(sum_new_rows, 0)) = 0 THEN 0
             ELSE SUM(COALESCE(sum_generate_nows, 0)) / SUM(COALESCE(sum_new_rows, 0))
        END AS generate_rate1
    FROM ks_data_factory.sql_completion_perf_user_info_with_generate
    WHERE p_date >= '{{ ds_nodash - 30 }}' AND p_date <= '{{ ds_nodash }}'
    GROUP BY username, std_1st_dep, std_2nd_dep, std_3rd_dep, channel
),
quantiles AS (
    -- Despite the _q25/_q75 suffixes, these are the 1/3 and 2/3 quantiles (tertile boundaries)
    SELECT
        PERCENTILE_APPROX(ua.sum_exposure_cnt1, 1/3, 9999) AS exposure_q25,
        PERCENTILE_APPROX(ua.sum_exposure_cnt1, 2/3, 9999) AS exposure_q75,
        PERCENTILE_APPROX(ua.sum_apply_rate1, 1/3, 9999) AS apply_rate_q25,
        PERCENTILE_APPROX(ua.sum_apply_rate1, 2/3, 9999) AS apply_rate_q75,
        PERCENTILE_APPROX(ua.sum_new_rows1, 1/3, 9999) AS new_rows_q25,
        PERCENTILE_APPROX(ua.sum_new_rows1, 2/3, 9999) AS new_rows_q75,
        PERCENTILE_APPROX(ua.generate_rate1, 1/3, 9999) AS generate_rate_q25,
        PERCENTILE_APPROX(ua.generate_rate1, 2/3, 9999) AS generate_rate_q75
    FROM user_aggregated ua
)
SELECT
    ua.username, std_1st_dep, std_2nd_dep, std_3rd_dep, channel,
    ua.sum_request_cnt1, ua.sum_apply_cnt1, ua.sum_exposure_cnt1, ua.sum_apply_rate1,
    ua.sum_apply_line_cnt1, ua.sum_generate_nows1, ua.sum_new_rows1, ua.generate_rate1,
    COALESCE(usr.status_ratio, 0) AS status_ratio,
    -- Echo the tertile boundaries in the output
    q.exposure_q25, q.exposure_q75,
    q.apply_rate_q25, q.apply_rate_q75,
    q.new_rows_q25, q.new_rows_q75,
    q.generate_rate_q25, q.generate_rate_q75,
    -- Exposure-count class (tertile-based)
    CASE WHEN ua.sum_exposure_cnt1 < q.exposure_q25 THEN 'L'
         WHEN ua.sum_exposure_cnt1 > q.exposure_q75 THEN 'H'
         ELSE 'M' END AS exposure_category,
    -- Apply-rate class (tertile-based)
    CASE WHEN ua.sum_apply_rate1 < q.apply_rate_q25 THEN 'L'
         WHEN ua.sum_apply_rate1 > q.apply_rate_q75 THEN 'H'
         ELSE 'M' END AS apply_rate_category,
    -- New-row-count class (tertile-based)
    CASE WHEN ua.sum_new_rows1 < q.new_rows_q25 THEN 'L'
         WHEN ua.sum_new_rows1 > q.new_rows_q75 THEN 'H'
         ELSE 'M' END AS new_rows_category,
    -- Generate-rate class (tertile-based)
    CASE WHEN ua.generate_rate1 < q.generate_rate_q25 THEN 'L'
         WHEN ua.generate_rate1 > q.generate_rate_q75 THEN 'H'
         ELSE 'M' END AS generate_rate_category,
    -- Combined segment field (RFM-style: four tertile-classified dimensions)
    CONCAT(
        CASE WHEN ua.sum_exposure_cnt1 < q.exposure_q25 THEN 'L'
             WHEN ua.sum_exposure_cnt1 > q.exposure_q75 THEN 'H'
             ELSE 'M' END,
        CASE WHEN ua.sum_apply_rate1 < q.apply_rate_q25 THEN 'L'
             WHEN ua.sum_apply_rate1 > q.apply_rate_q75 THEN 'H'
             ELSE 'M' END,
        CASE WHEN ua.sum_new_rows1 < q.new_rows_q25 THEN 'L'
             WHEN ua.sum_new_rows1 > q.new_rows_q75 THEN 'H'
             ELSE 'M' END,
        CASE WHEN ua.generate_rate1 < q.generate_rate_q25 THEN 'L'
             WHEN ua.generate_rate1 > q.generate_rate_q75 THEN 'H'
             ELSE 'M' END
    ) AS user_segment
FROM user_aggregated ua
LEFT JOIN user_status_ratio usr ON ua.username = usr.username
CROSS JOIN quantiles q;
```

Please analyze the code above.
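The four classification columns above all follow the same pattern: compare a user-level metric against its 1/3 and 2/3 quantiles and emit 'L'/'M'/'H', then concatenate the letters into a segment code. A minimal Python sketch of that logic (metric names and sample values are illustrative; `np.percentile` stands in for Hive's `PERCENTILE_APPROX`):

```python
import numpy as np

def tertile_label(value, q_low, q_high):
    """Classify one metric value by its tertile boundaries, like the SQL CASE."""
    if value < q_low:
        return 'L'
    if value > q_high:
        return 'H'
    return 'M'

def segment_users(metrics):
    """metrics: dict of metric name -> 1-D array (one entry per user).
    Returns the concatenated L/M/H code per user, like the CONCAT(...) column."""
    bounds = {name: (np.percentile(vals, 100 / 3), np.percentile(vals, 200 / 3))
              for name, vals in metrics.items()}
    n_users = len(next(iter(metrics.values())))
    return [''.join(tertile_label(metrics[name][i], *bounds[name]) for name in metrics)
            for i in range(n_users)]

exposure = np.array([1, 5, 9, 50, 100])
apply_rate = np.array([0.1, 0.5, 0.9, 0.2, 0.8])
print(segment_users({'exposure': exposure, 'apply_rate': apply_rate}))
# → ['LL', 'LM', 'MH', 'HL', 'HH']
```

Note that `np.percentile` is exact while `PERCENTILE_APPROX` is approximate, so values sitting right on a boundary may be labeled differently on large datasets.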


```python
import os

import numpy as np
import pandas as pd
import pyvista as pv
from pykrige.ok3d import OrdinaryKriging3D


# ===== 1. Data loading and preprocessing =====
def load_data(file_path):
    df = pd.read_excel(file_path, sheet_name='土壤')
    mask = ~df['@@甲苯'].isna() & (df['@@甲苯'] >= 0.01)
    df = df[mask].copy()
    # Extract coordinates and values (assumes these column names exist)
    x = df['x'].values.astype(float)
    y = df['y'].values.astype(float)
    z = df['Elevation'].values.astype(float)
    values = df['@@甲苯'].values.astype(float)
    # Drop extreme outliers via the IQR rule
    q25, q75 = np.percentile(values, [25, 75])
    iqr = q75 - q25
    valid_mask = values <= (q75 + 3 * iqr)
    return x[valid_mask], y[valid_mask], z[valid_mask], values[valid_mask]

x, y, z, values = load_data('head/Dinglong2.xlsx')

# ===== 2. Log transform =====
epsilon = 1e-6
log_values = np.log(values + epsilon)

# ===== 3. Kriging interpolation =====
ok3d = OrdinaryKriging3D(
    x, y, z, log_values,
    variogram_model='spherical',
    variogram_parameters={'nugget': 0.1, 'sill': np.var(log_values), 'range': 100.0},
    nlags=20,
    enable_plotting=False,
)

# Build the grid (corrected extension logic)
grid_x = np.linspace(x.min() - 50, x.max() + 50, 50)  # 50 m buffer distance
grid_y = np.linspace(y.min() - 50, y.max() + 50, 50)
grid_z = np.linspace(z.min() - 10, z.max() + 10, 30)  # smaller extension along Z
log_krig, log_var = ok3d.execute('grid', grid_x, grid_y, grid_z)

# ===== 4. Back-transform and cleaning =====
# Lognormal bias correction: E[X] = exp(mu + sigma^2 / 2)
krig_values = np.exp(log_krig + 0.5 * log_var) - epsilon
krig_values = np.nan_to_num(krig_values, nan=0.0)  # replace NaN
krig_values = np.clip(krig_values, 0, None)        # enforce non-negative values

# ===== 5. Building the 3-D grid =====
def create_volume_grid(grid_x, grid_y, grid_z, data):
    """Create the rectilinear volume grid (corrected data alignment)."""
    # Force (nx, ny, nz) alignment
    data_3d = data.reshape(len(grid_x), len(grid_y), len(grid_z), order='F')
    grid = pv.RectilinearGrid(grid_x, grid_y, grid_z)
    grid["浓度"] = data_3d.flatten(order='F')
    return grid

grid = create_volume_grid(grid_x, grid_y, grid_z, krig_values)

# Sanity checks
print("[Data check]")
print(f"Grid points: {grid.n_points}")
print(f"Concentration range: {grid['浓度'].min():.2f} ~ {grid['浓度'].max():.2f}")

# Pick the isosurface threshold dynamically
if grid['浓度'].max() < 1e-3:
    raise ValueError("Concentration data look wrong; check the kriging parameters")
else:
    threshold = np.percentile(grid['浓度'][grid['浓度'] > 0], 75)
    contour = grid.contour([threshold])
    print(f"Effective threshold: {threshold:.2f}")

# Visualization
plotter = pv.Plotter()
try:
    plotter.add_mesh(
        contour,
        cmap='hot',
        opacity=0.7,
        clim=[grid['浓度'].min(), grid['浓度'].max()],
        show_scalar_bar=True,
    )
except ValueError:
    print("Isosurface is empty; falling back to orthogonal slices")
    slices = grid.slice_orthogonal()
    plotter.add_mesh(slices, cmap='hot', opacity=0.7)

# Add the borehole points
points = pv.PolyData(np.column_stack((x, y, z)))
points["浓度"] = values
plotter.add_mesh(
    points,
    scalars="浓度",
    point_size=10,
    render_points_as_spheres=True,
    cmap='hot',
)
plotter.show()
```

Analyze this code; in the generated VTK file, convert the `np.log`-transformed data back to the original scale.
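On the back-transform question: since the script krigs `log(values + epsilon)`, the original scale is recovered with `np.exp(...) - epsilon`; adding `0.5 * log_var` before exponentiating yields the lognormal mean rather than the median, which is exactly what the script does. A small numpy sketch (the array values are made up for illustration):

```python
import numpy as np

epsilon = 1e-6  # same offset the script adds before taking the log

# Made-up stand-ins for the kriged log-values and the kriging variance
log_krig = np.array([0.0, 1.0, 2.0])
log_var = np.array([0.0, 0.5, 1.0])

# Median-type back-transform: simply invert the log
median_est = np.exp(log_krig) - epsilon

# Mean-type back-transform with the lognormal bias correction,
# matching the script's np.exp(log_krig + 0.5 * log_var) - epsilon
mean_est = np.exp(log_krig + 0.5 * log_var) - epsilon
mean_est = np.clip(np.nan_to_num(mean_est, nan=0.0), 0, None)
```

If the saved VTK file stores the log-scale array, the same expression can be applied to that array before (or after) writing, e.g. `grid["conc"] = np.exp(grid["log_conc"]) - epsilon` (hypothetical array names).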


Rewrite this as simpler code whose runtime results stay exactly the same:

```python
import math
from datetime import datetime
import numpy as np
import os
import matplotlib.pyplot as plt
import matplotlib.font_manager as fm
from collections import Counter


# --- Font and style configuration ---
def configure_fonts_and_styles():
    chinese_font_name = None
    try:
        if os.name == 'nt':
            test_font = 'SimSun'
            try:
                fm.FontProperties(family=test_font).get_name()
                chinese_font_name = test_font
            except RuntimeError:
                print(f"警告:Windows系统中未直接找到 {test_font} 字体,尝试 'Microsoft YaHei' 或 'SimHei'。")
                fallback_fonts = ['Microsoft YaHei', 'SimHei']
                for f_name in fallback_fonts:
                    try:
                        fm.FontProperties(family=f_name).get_name()
                        chinese_font_name = f_name
                        break
                    except RuntimeError:
                        continue
        elif os.name == 'posix':
            common_fonts = ['SimSun', 'Songti SC', 'STSong', 'WenQuanYi Micro Hei',
                            'Noto Serif CJK SC', 'Noto Sans CJK SC']
            for f_name in common_fonts:
                try:
                    if fm.FontProperties(family=f_name).get_name():
                        chinese_font_name = f_name
                        break
                except RuntimeError:
                    continue
        if chinese_font_name:
            plt.rcParams['font.family'] = chinese_font_name
            print(f"中文字体尝试设置为: {chinese_font_name}")
        else:
            print("警告:未能自动配置一个特定的中文字体 (如 SimSun - 宋体)。中文可能无法正确显示。")
    except Exception as e:
        print(f"设置中文字体时出错: {e}")

    try:
        current_serif = plt.rcParams.get('font.serif', [])
        if 'Times New Roman' not in current_serif:
            try:
                fm.FontProperties(family='Times New Roman').get_name()
                plt.rcParams['font.serif'] = ['Times New Roman'] + current_serif
                print("英文衬线字体 (serif) 将优先尝试: Times New Roman")
            except RuntimeError:
                print("警告:系统中未找到 'Times New Roman' 字体。英文衬线字体可能使用默认设置。")
    except Exception as e:
        print(f"设置Times New Roman时出错: {e}")

    font_size = 14
    plt.rcParams['font.size'] = font_size
    plt.rcParams['axes.titlesize'] = font_size + 4
    plt.rcParams['axes.labelsize'] = font_size + 2
    plt.rcParams['xtick.labelsize'] = font_size
    plt.rcParams['ytick.labelsize'] = font_size
    plt.rcParams['legend.fontsize'] = font_size  # adjusted legend font size
    plt.rcParams['figure.titlesize'] = font_size + 6
    plt.rcParams['axes.unicode_minus'] = False
    print(f"全局字体大小已设置为基础: {font_size}pt")


# --- Configuration ---
BASE_FILE_PATH = r"C:\Users\byx\Desktop\Dataset\Dataset"
CHARGING_DATA_FILE = os.path.join(BASE_FILE_PATH, "Charging_Data.csv")
OUTPUT_DIR = "visualizations_layout_adjusted"  # new output folder
if not os.path.exists(OUTPUT_DIR):
    os.makedirs(OUTPUT_DIR)

CHARGING_COLUMN_MAPPINGS = {
    'power': 'Transaction power/kwh',
    'start_time': 'Start Time',
    'end_time': 'End Time',
    'service_fee': 'Service charge/Yuan',
    'end_cause': 'end_cause',
    'temperature_direct': 'Temperature(℃)',
    'district_charging': 'District Name',
}


# --- CSV reading and parsing helpers ---
def read_csv(filepath, column_selection_map=None):
    data = []
    header = []
    try:
        with open(filepath, 'r', encoding='utf-8-sig') as f:
            lines = f.readlines()
        if not lines:
            return [], []
        header = [h.strip().replace('"', '') for h in lines[0].strip().split(',')]
        if column_selection_map:
            col_indices = {internal_key: header.index(actual_col_name) if actual_col_name in header else -1
                           for internal_key, actual_col_name in column_selection_map.items()}
        else:
            col_indices = {h: i for i, h in enumerate(header)}
        for line in lines[1:]:
            if not line.strip():
                continue
            values = [v.strip().replace('"', '') for v in line.split(',')]
            if len(values) != len(header):
                continue
            row_dict = {internal_key: values[idx] if idx != -1 and idx < len(values) else None
                        for internal_key, idx in col_indices.items()}
            data.append(row_dict)
    except FileNotFoundError:
        print(f"错误:文件 {filepath} 未找到。")
        return [], []
    except Exception as e:
        print(f"读取CSV文件 {filepath} 时发生错误: {e}")
        return [], []
    return data, header


def parse_datetime_custom(dt_str, context=""):
    if not dt_str:
        return None
    dt_str_cleaned = dt_str.strip()
    formats = ["%Y-%m-%d %H:%M:%S", "%Y/%m/%d %H:%M", "%m/%d/%Y %H:%M",
               "%Y-%m-%d %H:%M", "%Y/%m/%d %H:%M:%S", "%Y%m%d", "%Y-%m-%d"]
    for fmt in formats:
        try:
            return datetime.strptime(dt_str_cleaned, fmt)
        except ValueError:
            continue
    return None


# --- Statistics helpers ---
def calculate_mean(data_list):
    valid_data = [x for x in data_list if x is not None and isinstance(x, (int, float))]
    if not valid_data:
        return 0.0
    return sum(valid_data) / len(valid_data)


def calculate_std_dev(data_list, ddof=1):
    valid_data_float = []
    for x in data_list:
        if x is not None:
            try:
                valid_data_float.append(float(x))
            except ValueError:
                pass
    if len(valid_data_float) < max(2, ddof):
        return 0.0
    mean = calculate_mean(valid_data_float)
    variance = sum((x - mean) ** 2 for x in valid_data_float) / (len(valid_data_float) - ddof)
    return math.sqrt(variance)


def calculate_median(data_list):
    valid_data = [x for x in data_list if x is not None and isinstance(x, (int, float))]
    if not valid_data:
        return 0.0
    sorted_list = sorted(valid_data)
    n = len(sorted_list)
    mid = n // 2
    if n % 2 == 1:
        return sorted_list[mid]
    return (sorted_list[mid - 1] + sorted_list[mid]) / 2.0


def calculate_pearson_correlation(list1, list2):
    paired_values = []
    for x, y in zip(list1, list2):
        if x is not None and y is not None:
            try:
                val_x = float(x)
                val_y = float(y)
                paired_values.append((val_x, val_y))
            except ValueError:
                continue
    if len(paired_values) < 2:
        return 0.0
    x_vals = [p[0] for p in paired_values]
    y_vals = [p[1] for p in paired_values]
    n = len(x_vals)
    if n == 0:
        return 0.0
    mean1, mean2 = calculate_mean(x_vals), calculate_mean(y_vals)
    numerator = sum((x - mean1) * (y - mean2) for x, y in zip(x_vals, y_vals))
    sum_sq_diff1 = sum((x - mean1) ** 2 for x in x_vals)
    sum_sq_diff2 = sum((y - mean2) ** 2 for y in y_vals)
    denominator = math.sqrt(sum_sq_diff1 * sum_sq_diff2)
    if denominator == 0:
        return 0.0
    return numerator / denominator


# --- Manual KDE helpers ---
def gaussian_kernel(u):
    return (1 / np.sqrt(2 * np.pi)) * np.exp(-0.5 * u ** 2)


def calculate_kde_bandwidth(data_points_np):
    n = len(data_points_np)
    if n == 0:
        return 0.1
    std_dev = calculate_std_dev(list(data_points_np), ddof=1)
    if std_dev == 0:
        q75, q25 = np.percentile(data_points_np, [75, 25])
        iqr = q75 - q25
        if iqr > 0:
            A = iqr / 1.349
        elif data_points_np.max() != data_points_np.min():
            A = 0.01 * (data_points_np.max() - data_points_np.min())
        else:
            A = 0.01
        if A <= 1e-5:
            A = 0.01
        return 0.9 * A * (n ** (-0.2))
    return 1.06 * std_dev * (n ** (-0.2))


def manual_kde(data_points_np, x_eval_points_np, bandwidth):
    n_data = len(data_points_np)
    if n_data == 0 or bandwidth <= 1e-6:
        return np.zeros_like(x_eval_points_np)
    data_points_np = np.asarray(data_points_np).reshape(-1, 1)
    x_eval_points_np = np.asarray(x_eval_points_np).reshape(1, -1)
    u = (x_eval_points_np - data_points_np) / bandwidth
    kernel_values = gaussian_kernel(u)
    kde_y = np.sum(kernel_values, axis=0) / (n_data * bandwidth)
    return kde_y


# --- Main data-processing function ---
def process_data():
    print("--- 正在加载和处理数据 ---")
    charging_data_raw, _ = read_csv(CHARGING_DATA_FILE, CHARGING_COLUMN_MAPPINGS)
    if not charging_data_raw:
        print("错误:充电数据未能加载或为空。")
        return [], {}
    num_raw_charging_records = len(charging_data_raw)

    cleaned_data = []
    power_zero_skipped = 0
    parse_errors_time_start = 0
    parse_errors_time_end = 0
    value_errors_power = 0
    time_logic_errors = 0
    missing_power = 0
    missing_start_time = 0
    missing_end_time = 0
    value_errors_fee = 0
    temp_parse_errors = 0

    for record in charging_data_raw:
        power_str = record.get('power')
        start_time_str = record.get('start_time')
        end_time_str = record.get('end_time')
        service_fee_str = record.get('service_fee')
        end_cause_val = record.get('end_cause', "未知原因")
        temp_direct_str = record.get('temperature_direct')
        district_name_val = record.get('district_charging', "未知区域")

        if power_str is None:
            missing_power += 1
            continue
        if start_time_str is None:
            missing_start_time += 1
            continue
        if end_time_str is None:
            missing_end_time += 1
            continue
        try:
            transaction_power = float(power_str)
        except (ValueError, TypeError):
            value_errors_power += 1
            continue
        if transaction_power == 0:
            power_zero_skipped += 1
            continue
        start_time = parse_datetime_custom(start_time_str)
        if not start_time:
            parse_errors_time_start += 1
            continue
        end_time = parse_datetime_custom(end_time_str)
        if not end_time:
            parse_errors_time_end += 1
            continue
        if end_time < start_time:
            time_logic_errors += 1
            continue
        duration_hours = (end_time - start_time).total_seconds() / 3600.0
        try:
            service_fee = float(service_fee_str) if service_fee_str else 0.0
        except (ValueError, TypeError):
            service_fee = 0.0
            value_errors_fee += 1
        final_temperature = None
        if temp_direct_str is not None and temp_direct_str != "":
            try:
                final_temperature = float(temp_direct_str)
            except (ValueError, TypeError):
                temp_parse_errors += 1
        cleaned_data.append({
            'transaction_power': transaction_power,
            'duration_hours': duration_hours,
            'district_name': district_name_val,
            'service_fee': service_fee,
            'end_cause': end_cause_val,
            'temperature': final_temperature,
        })

    print(f"\n数据清洗和转换完成。原始充电记录: {num_raw_charging_records}")
    if missing_power > 0:
        print(f"  - 因缺少功率数据跳过: {missing_power} 条")
    print(f"  - 因功率为0跳过: {power_zero_skipped} 条")
    # ... (printouts for the other error counters could be added here)
    print(f"最终得到 {len(cleaned_data)} 条有效记录用于分析。")

    district_stats_aggregated = {}
    temp_district_data = {}
    for record in cleaned_data:
        district = record['district_name']
        if district not in temp_district_data:
            temp_district_data[district] = {'powers': [], 'durations': [], 'service_fees': []}
        if record['transaction_power'] is not None:
            temp_district_data[district]['powers'].append(record['transaction_power'])
        if record['duration_hours'] is not None:
            temp_district_data[district]['durations'].append(record['duration_hours'])
        if record['service_fee'] is not None:
            temp_district_data[district]['service_fees'].append(record['service_fee'])
    for district, data in temp_district_data.items():
        district_stats_aggregated[district] = {
            'avg_power': calculate_mean(data['powers']),
            'std_dev_duration': calculate_std_dev(data['durations']),
            'median_service_fee': calculate_median(data['service_fees']),
            'count': len(data['powers']),
        }
    return cleaned_data, district_stats_aggregated


# --- Visualization functions ---
def plot_transaction_power_distribution(cleaned_data, output_dir):
    if not cleaned_data:
        return
    transaction_powers_list = [r['transaction_power'] for r in cleaned_data
                               if r['transaction_power'] is not None and r['transaction_power'] > 0]
    if not transaction_powers_list:
        print("无有效的正充电量数据用于绘制分布图。")
        return
    transaction_powers_np = np.array(transaction_powers_list)
    fig, axes = plt.subplots(1, 2, figsize=(16, 7))  # slightly adjusted figsize

    # Subplot 1: density histogram with KDE curve
    ax = axes[0]
    ax.hist(transaction_powers_np, bins=50, density=True, alpha=0.6,
            color='skyblue', edgecolor='gray', label='密度直方图')
    if len(transaction_powers_np) > 1:
        bandwidth = calculate_kde_bandwidth(transaction_powers_np)
        if bandwidth > 1e-5:
            plot_min = transaction_powers_np.min() - 1 * bandwidth
            plot_max = transaction_powers_np.max() + 1 * bandwidth
            x_range = np.linspace(max(0, plot_min), plot_max, 500)
            kde_y = manual_kde(transaction_powers_np, x_range, bandwidth)
            ax.plot(x_range, kde_y, color='darkviolet', linewidth=2.5, label='KDE曲线')
    ax.set_title('充电量分布 (直方图与KDE)')
    ax.set_xlabel('充电量 (kWh)')
    ax.set_ylabel('密度')
    ax.legend()
    ax.grid(axis='y', alpha=0.75)
    ax.set_xlim(left=0)

    # Subplot 2: box plot
    ax = axes[1]
    bp = ax.boxplot(transaction_powers_np, vert=False, widths=0.7, patch_artist=True,
                    boxprops=dict(facecolor='lightgreen', color='black'),
                    medianprops=dict(color='red', linewidth=2))
    ax.set_title('充电量分布 (箱线图)')
    ax.set_xlabel('充电量 (kWh)')
    ax.set_yticklabels([])
    ax.grid(axis='x', alpha=0.75)
    ax.set_xlim(left=0)

    fig.suptitle('充电量 Transaction Power 分布概览', fontweight='bold')
    fig.tight_layout(rect=[0, 0.05, 1, 0.93])  # rect=[left, bottom, right, top]
    save_path = os.path.join(output_dir, "transaction_power_distribution_kde.png")
    plt.savefig(save_path)
    print(f"充电量分布图 (含KDE) 已保存至: {save_path}")


def plot_power_by_end_cause(cleaned_data, output_dir, top_n=5):
    if not cleaned_data:
        return
    end_cause_data = {}
    for record in cleaned_data:
        cause = record['end_cause'] if record['end_cause'] else "未知"
        power = record['transaction_power']
        if power is None:
            continue
        if cause not in end_cause_data:
            end_cause_data[cause] = []
        end_cause_data[cause].append(power)
    if not end_cause_data:
        print("无有效的充电终止原因数据用于绘图。")
        return
    cause_counts = Counter({cause: len(powers) for cause, powers in end_cause_data.items()})
    common_causes_original = [cause for cause, count in cause_counts.most_common(top_n)]
    # Truncate over-long labels
    common_causes = [(c[:30] + '...' if len(c) > 30 else c) for c in common_causes_original]
    # Look the data up by the original (untruncated) cause
    plot_data = [end_cause_data[cause_orig] for cause_orig in common_causes_original
                 if cause_orig in end_cause_data]
    if not plot_data:
        print(f"筛选后(前{top_n}个)无充电终止原因数据用于绘图。")
        return
    fig, ax = plt.subplots(figsize=(max(12, top_n * 1.5), 10))  # dynamic width, extra height
    bp = ax.boxplot(plot_data, vert=True, patch_artist=True, labels=common_causes, widths=0.6)
    try:
        # Matplotlib 3.7+
        cmap = plt.colormaps.get_cmap('Pastel2')
        color_list = cmap(np.linspace(0, 1, len(plot_data)))
    except AttributeError:
        # Fallback for older Matplotlib
        cmap = plt.cm.get_cmap('Pastel2', len(plot_data))
        color_list = cmap.colors if hasattr(cmap, 'colors') else [cmap(i) for i in np.linspace(0, 1, len(plot_data))]
    for patch, color in zip(bp['boxes'], color_list):
        patch.set_facecolor(color)
    for median in bp['medians']:
        median.set_color('black')
        median.set_linewidth(2)
    ax.set_title(f'不同充电终止原因下的充电量分布 (Top {top_n})', fontweight='bold')
    ax.set_ylabel('充电量 (kWh)')
    ax.set_xlabel('充电终止原因')
    # Shrink tick labels slightly when there are many categories
    tick_label_fontsize = plt.rcParams['xtick.labelsize'] - (2 if top_n > 5 else 0)
    if tick_label_fontsize < 8:
        tick_label_fontsize = 8  # minimum font size
    plt.setp(ax.get_xticklabels(), rotation=30, ha="right", fontsize=tick_label_fontsize)
    ax.grid(axis='y', linestyle='--', alpha=0.7)
    # Enlarge the bottom margin so the x-axis labels have room when there are many of them
    fig.tight_layout(rect=[0, 0.20 if top_n > 3 else 0.1, 1, 0.95])
    save_path = os.path.join(output_dir, f"power_by_end_cause_top{top_n}.png")
    plt.savefig(save_path)
    print(f"按终止原因分布的充电量图 (Top {top_n}) 已保存至: {save_path}")


def plot_temp_vs_power_scatter(cleaned_data, output_dir):
    if not cleaned_data:
        return
    temperatures = [r['temperature'] for r in cleaned_data
                    if r['temperature'] is not None and r['transaction_power'] is not None]
    powers = [r['transaction_power'] for r in cleaned_data
              if r['temperature'] is not None and r['transaction_power'] is not None]
    if not temperatures or not powers or len(temperatures) < 2:
        print("无足够的有效温度和充电量数据用于绘制散点图。")
        return
    fig, ax = plt.subplots(figsize=(12, 7))
    ax.scatter(temperatures, powers, alpha=0.4, edgecolors='w', s=40, c='steelblue')
    ax.set_title('温度与充电量关系散点图', fontweight='bold')
    ax.set_xlabel('温度 (℃)')
    ax.set_ylabel('充电量 (kWh)')
    ax.grid(True, linestyle='--', alpha=0.7)
    fig.tight_layout()
    save_path = os.path.join(output_dir, "temperature_vs_power_scatter.png")
    plt.savefig(save_path)
    print(f"温度与充电量散点图已保存至: {save_path}")


def plot_district_statistics(district_stats, output_dir):
    if not district_stats:
        print("无有效的区域统计数据用于绘图。")
        return
    sorted_districts_items = sorted(district_stats.items())
    if not sorted_districts_items:
        print("排序后无区域数据用于绘图。")
        return
    districts = [item[0] for item in sorted_districts_items]
    avg_powers = [item[1]['avg_power'] for item in sorted_districts_items]
    std_dev_durations = [item[1]['std_dev_duration'] for item in sorted_districts_items]
    median_service_fees = [item[1]['median_service_fee'] for item in sorted_districts_items]
    x = np.arange(len(districts))
    width = 0.35
    fig_base_width = 8
    fig_width = max(fig_base_width, len(districts) * 1.5 + 2)  # width scales with district count
    fig_height = 8  # taller default

    def _plot_bar_chart(ax, x_data, y_data, width, label_text, color,
                        y_label_text, title_text, districts_labels, bar_label_padding=5):
        rects = ax.bar(x_data, y_data, width, label=label_text, color=color)
        ax.set_ylabel(y_label_text)
        ax.set_title(title_text, fontweight='bold')
        ax.set_xticks(x_data)
        ax.set_xticklabels(districts_labels, rotation=45, ha="right")
        ax.legend(loc='upper right')
        ax.bar_label(rects, padding=bar_label_padding, fmt='%.2f')
        # Raise the y-limit so the bar labels stay visible
        if y_data:
            max_val_with_label = max(y_data) * 1.15  # assume labels fit in 15% headroom
            current_top = ax.get_ylim()[1]
            ax.set_ylim(top=max(current_top, max_val_with_label))

    # Figure 1: average transaction power per district
    fig1, ax1 = plt.subplots(figsize=(fig_width, fig_height))
    _plot_bar_chart(ax1, x, avg_powers, width, '平均充电量', 'cornflowerblue',
                    '平均充电量 (kWh)', '各区域平均充电量', districts)
    fig1.tight_layout(rect=[0, 0.25, 1, 0.93])  # extra bottom and top margins
    save_path1 = os.path.join(output_dir, "district_avg_power.png")
    plt.savefig(save_path1)
    print(f"区域平均充电量图已保存至: {save_path1}")

    # Figure 2: standard deviation of charging duration per district
    fig2, ax2 = plt.subplots(figsize=(fig_width, fig_height))
    _plot_bar_chart(ax2, x, std_dev_durations, width, '充电时长标准差', 'lightcoral',
                    '充电时长标准差 (小时)', '各区域充电时长标准差', districts)
    fig2.tight_layout(rect=[0, 0.25, 1, 0.93])
    save_path2 = os.path.join(output_dir, "district_duration_stddev.png")
    plt.savefig(save_path2)
    print(f"区域充电时长标准差图已保存至: {save_path2}")

    # Figure 3: median service fee per district
    fig3, ax3 = plt.subplots(figsize=(fig_width, fig_height))
    _plot_bar_chart(ax3, x, median_service_fees, width, '服务费中位数', 'mediumseagreen',
                    '服务费中位数 (元)', '各区域服务费中位数', districts)
    fig3.tight_layout(rect=[0, 0.25, 1, 0.93])
    save_path3 = os.path.join(output_dir, "district_median_service_fee.png")
    plt.savefig(save_path3)
    print(f"区域服务费中位数图已保存至: {save_path3}")


# --- Entry point ---
if __name__ == '__main__':
    configure_fonts_and_styles()
    cleaned_data, district_stats_aggregated = process_data()
    if cleaned_data:
        print("\n--- 开始生成可视化图表与分析 ---")
        print("\n--- Pearson 相关系数计算 ---")
        temperatures_for_corr = [r['temperature'] for r in cleaned_data]
        powers_for_corr = [r['transaction_power'] for r in cleaned_data]
        pearson_corr = calculate_pearson_correlation(temperatures_for_corr, powers_for_corr)
        valid_pairs_count = len([1 for t, p in zip(temperatures_for_corr, powers_for_corr)
                                 if t is not None and p is not None
                                 and isinstance(t, (int, float)) and isinstance(p, (int, float))])
        if valid_pairs_count > 1:
            print(f"Temperature与Transaction power的Pearson相关系数: {pearson_corr:.4f} "
                  f"(基于 {valid_pairs_count} 个有效数据点对)")
        else:
            print("没有足够的有效数据来计算Temperature与Transaction power的Pearson相关系数。")
        print("\n--- 生成图表 ---")
        plot_transaction_power_distribution(cleaned_data, OUTPUT_DIR)
        plot_power_by_end_cause(cleaned_data, OUTPUT_DIR, top_n=5)
        plot_temp_vs_power_scatter(cleaned_data, OUTPUT_DIR)
        if district_stats_aggregated:
            plot_district_statistics(district_stats_aggregated, OUTPUT_DIR)
        else:
            print("无区域统计数据,跳过相关图表绘制。")
        print(f"\n所有图表已尝试保存到 '{OUTPUT_DIR}' 文件夹中。")
        print("正在尝试显示所有图表...")
        plt.show()
        print("所有图表窗口已尝试显示完毕。")
    else:
        print("没有清洗后的数据可用于可视化。")
    print("\n可视化任务结束。")
```
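The script's `calculate_pearson_correlation` is a hand-rolled Pearson r; a quick way to validate any simplification is to cross-check it against `np.corrcoef`. A condensed sketch of the same computation (the sample lists are made up):

```python
import math
import numpy as np

def pearson(xs, ys):
    """Plain-Python Pearson correlation, mirroring the script's helper
    (assumes the inputs are already clean numeric lists of equal length)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return 0.0 if den == 0 else num / den

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.1]
# The manual value and numpy's should agree to floating-point precision
print(pearson(xs, ys), float(np.corrcoef(xs, ys)[0, 1]))
```

The `den == 0` guard matches the script's behavior of returning 0.0 for constant inputs, where Pearson r is undefined (numpy would emit NaN with a warning instead).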


Q21: Which of the following is a valid user-defined output stream manipulator header?
a. ostream& tab( ostream& output )
b. ostream tab( ostream output )
c. istream& tab( istream output )
d. void tab( ostream& output )

Q22: What will be output by the following statement? `cout << showpoint << setprecision(4) << 11.0 << endl;`
a. 11
b. 11.0
c. 11.00
d. 11.000

Q23: Which of the following stream manipulators causes an outputted number's sign to be left justified, its magnitude to be right justified and the center space to be filled with fill characters?
a. left
b. right
c. internal
d. showpos

Q24: Which of the following statements restores the default fill character?
a. cout.defaultFill();
b. cout.fill();
c. cout.fill( 0 );
d. cout.fill( ' ' );

Q25: When the showbase flag is set:
a. The base of a number precedes it in brackets.
b. Decimal numbers are not output any differently.
c. "oct" or "hex" will be displayed in the output stream.
d. Octal numbers can appear in one of two ways.

Q26: What will be output by the following statements? `double x = .0012345; cout << fixed << x << endl; cout << scientific << x << endl;`
a. 1.234500e-003 0.001235
b. 1.23450e-003 0.00123450
c. .001235 1.234500e-003
d. 0.00123450 1.23450e-003

Q27: Which of the following outputs does not guarantee that the uppercase flag has been set?
a. All hexadecimal numbers appear in the form 0X87.
b. All numbers written in scientific notation appear in the form 6.45E+010.
c. All text outputs appear in the form SAMPLE OUTPUT.
d. All hexadecimal numbers appear in the form AF6.

Q28: Which of the following is not true about bool values and how they're output with the output stream?
a. The old style of representing true/false values used -1 to indicate false and 1 to indicate true.
b. A bool value outputs as 0 or 1 by default.
c. Stream manipulator boolalpha sets the output stream to display bool values as the strings "true" and "false".
d. Both boolalpha and noboolalpha are "sticky" settings.

Uploader: mYlEaVeiSmVp