After Seeing This Spark + Django Nutrition Analysis System, Would You Still Say a Big Data Graduation Project Is Hard?

Article link: https://blog.csdn.net/2501_92997004/article/details/150419496

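Preface

💖💖 Author: 计算机程序员小杨
💙💙 About me: I work in a computer-related field and am experienced in Java, WeChat Mini Programs, Python, Golang, Android, and other IT areas. I take on custom project development, code walkthroughs, thesis defense coaching, and documentation writing, and I also know some techniques for lowering similarity-check scores. I love technology, enjoy digging into new tools and frameworks, and like solving practical problems with code. If you have questions about code or technology, feel free to ask me!
💛💛 A word from me: thank you all for your follow and support!
💕💕 To get the source code, contact 计算机程序员小杨 (see the end of the post).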

1. Development Tools Overview

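Big data framework: Hadoop + Spark (Hive is not used in this build; customization is supported)
Languages: Python + Java (both versions are supported)
Back-end frameworks: Django + Spring Boot (Spring + SpringMVC + MyBatis) (both versions are supported)
Front end: Vue + ElementUI + ECharts + HTML + CSS + JavaScript + jQuery
Detailed technologies: Hadoop, HDFS, Spark, Spark SQL, Pandas, NumPy
Database: MySQL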

2. System Overview

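The Big-Data-Based Food Nutrition Data Visualization and Analysis System is a nutrition analysis platform built on a big data stack. It uses Hadoop for distributed storage and Spark as the processing engine, applies the Python data science libraries Pandas and NumPy for data mining, exposes API services through the Django back end, and renders visualizations on a Vue + ElementUI + ECharts front end.

The system covers these core modules: user management, food nutrition information management, macro nutrition landscape analysis, food nutrition ranking analysis, food category comparison analysis, dietary health risk assessment, and advanced algorithm exploration analysis. Together they support multi-dimensional statistical analysis of large food nutrition datasets. Spark SQL handles the queries and computation, from which the system generates nutrient comparison charts, trend analyses, and health risk warnings and offers dietary guidance to users. MySQL provides persistent storage, HDFS holds the large-scale nutrition datasets, and JavaScript and jQuery support front-end interaction, so the platform covers data collection, storage, analysis, and visualization in one place.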
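The flow described above ends with the back end returning JSON that the ECharts front end renders. As a minimal sketch of that last step, the helper below turns per-category averages into an ECharts bar-chart option dictionary. The function name `build_bar_option` and the sample numbers are assumptions made for illustration; they are not taken from the project's source.

```python
import json


def build_bar_option(rows, value_key, title):
    """Build an ECharts bar-chart option dict from a list of row dicts.

    Hypothetical helper: every row needs a 'category' key plus `value_key`.
    """
    categories = [row["category"] for row in rows]
    values = [row[value_key] for row in rows]
    return {
        "title": {"text": title},
        "tooltip": {},
        "xAxis": {"type": "category", "data": categories},
        "yAxis": {"type": "value"},
        "series": [{"name": value_key, "type": "bar", "data": values}],
    }


# Made-up sample data, only to show the shape of the output.
sample_rows = [
    {"category": "Vegetables", "avg_protein": 2.1},
    {"category": "Meat", "avg_protein": 20.5},
]
option = build_bar_option(sample_rows, "avg_protein", "Average protein by category")
print(json.dumps(option, ensure_ascii=False))
```

On the front end, a dictionary of this shape could be passed to an ECharts instance through `setOption`; building it on the server keeps the Vue component thin, though building it in the browser is an equally reasonable choice.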

3. System Feature Demo

Demo video: "After Seeing This Spark + Django Nutrition Analysis System, Would You Still Say a Big Data Graduation Project Is Hard?"

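4. System Interface Screenshots

Dashboard (large screen)
Advanced algorithm exploration analysis
Macro nutrition landscape analysis
Dietary health risk analysis
Food category comparison analysis
Food nutrition ranking analysis
Food nutrition information
User management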

5. System Source Code


import json

from django.http import JsonResponse

# Note: get_spark_session() is the project's own helper for obtaining a
# SparkSession; its definition is not shown in this post.


# Core feature 1: macro nutrition landscape analysis
def macro_nutrition_analysis(request):
    # Aggregate per category inside Spark SQL. Grouping by food_category only
    # (rather than by every nutrient column) yields one row per category, so the
    # averages and counts are correct and little data is collected to the driver.
    spark_session = get_spark_session()
    nutrition_df = spark_session.sql("""
        SELECT food_category,
               COUNT(*) AS food_count,
               AVG(protein) AS avg_protein,
               AVG(fat) AS avg_fat,
               AVG(carbohydrate) AS avg_carbs,
               AVG(calories) AS avg_calories,
               AVG(fiber + vitamin_a + vitamin_c + calcium + iron) AS nutrition_density
        FROM food_nutrition
        GROUP BY food_category
    """)
    category_rows = nutrition_df.collect()

    # Total number of foods across all categories
    total_foods = sum(row['food_count'] for row in category_rows)

    def rounded(value):
        # AVG returns NULL when every input is NULL; treat that as 0
        return round(float(value or 0), 2)

    # Compute each category's share of all foods and its average nutrient values
    analysis_result = []
    for row in category_rows:
        count = row['food_count']
        analysis_result.append({
            'category': row['food_category'],
            'percentage': round(count / total_foods * 100, 2),
            'avg_protein': rounded(row['avg_protein']),
            'avg_fat': rounded(row['avg_fat']),
            'avg_carbs': rounded(row['avg_carbs']),
            'avg_calories': rounded(row['avg_calories']),
            'nutrition_density': rounded(row['nutrition_density']),
            'food_count': count
        })

    # Sort by nutrition density and label each category by rank position:
    # top 30%, next 40%, and the remaining 30%
    analysis_result.sort(key=lambda x: x['nutrition_density'], reverse=True)
    for i, item in enumerate(analysis_result):
        if i < len(analysis_result) * 0.3:
            item['health_level'] = 'High-quality nutrition'
        elif i < len(analysis_result) * 0.7:
            item['health_level'] = 'Balanced nutrition'
        else:
            item['health_level'] = 'Needs improvement'

    return JsonResponse({'status': 'success', 'data': analysis_result})
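
The rank-based labeling at the end of the function above can be tried without Spark or Django. The sketch below is a standalone illustration under assumptions: the function name `assign_health_levels`, the English label strings, and the ten sample categories are all made up for the example.

```python
def assign_health_levels(items, key="nutrition_density"):
    """Rank items by `key` (highest first) and label each by rank position.

    The top 30% of positions get the first label, the next 40% the second,
    and the rest the third. Returns new dicts; the input is not modified.
    """
    ranked = sorted(items, key=lambda item: item[key], reverse=True)
    total = len(ranked)
    labeled = []
    for index, item in enumerate(ranked):
        if index < total * 0.3:
            level = "high-quality nutrition"
        elif index < total * 0.7:
            level = "balanced nutrition"
        else:
            level = "needs improvement"
        labeled.append({**item, "health_level": level})
    return labeled


# Made-up densities for ten hypothetical categories.
sample = [{"category": f"c{n}", "nutrition_density": float(n)} for n in range(1, 11)]
labeled_sample = assign_health_levels(sample)
levels = [row["health_level"] for row in labeled_sample]
print(levels)
```

Because the label depends only on rank position, a category's level says how it compares with the other categories in the dataset, not whether it meets any absolute nutritional standard.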

# Core feature 2: food nutrition ranking analysis

# Column names that may be interpolated into the SQL text. Checking user input
# against this whitelist prevents SQL injection through nutrient_type.
RANKABLE_NUTRIENTS = {
    'protein', 'fat', 'carbohydrate', 'calories', 'fiber',
    'vitamin_a', 'vitamin_c', 'calcium', 'iron'
}


def nutrition_ranking_analysis(request):
    nutrient_type = request.GET.get('nutrient_type', 'protein')
    analysis_type = request.GET.get('analysis_type', 'highest')
    if nutrient_type not in RANKABLE_NUTRIENTS:
        return JsonResponse({'status': 'error', 'message': 'Unsupported nutrient type'})
    try:
        # Clamp the limit; the upper bound of 100 is an arbitrary safety cap
        limit = max(1, min(int(request.GET.get('limit', 20)), 100))
    except ValueError:
        return JsonResponse({'status': 'error', 'message': 'limit must be an integer'})

    # Sort and truncate in Spark SQL; only validated values reach the query text
    spark_session = get_spark_session()
    direction = 'DESC' if analysis_type == 'highest' else 'ASC'
    ranking_df = spark_session.sql(f"""
        SELECT food_name, food_category, calories, protein, fat,
               carbohydrate, fiber, vitamin_a, vitamin_c, calcium, iron,
               ROUND({nutrient_type} / calories * 100, 2) AS nutrient_calorie_ratio
        FROM food_nutrition
        WHERE {nutrient_type} > 0 AND calories > 0
        ORDER BY {nutrient_type} {direction}
        LIMIT {limit}
    """)

    ranking_data = []
    for i, row in enumerate(ranking_df.collect()):
        # Nutrition value score: weighted nutrient sum relative to calories
        # (calories > 0 is guaranteed by the WHERE clause)
        nutrition_score = (
            (row['protein'] * 0.25) + (row['fiber'] * 0.20) +
            (row['vitamin_a'] * 0.15) + (row['vitamin_c'] * 0.15) +
            (row['calcium'] * 0.15) + (row['iron'] * 0.10)
        ) / row['calories'] * 100

        # Classify the content level of the selected nutrient
        nutrient_value = row[nutrient_type]
        if nutrient_value >= 15:
            level = 'Very high'
        elif nutrient_value >= 10:
            level = 'High'
        elif nutrient_value >= 5:
            level = 'Medium'
        else:
            level = 'Low'

        ranking_item = {
            'rank': i + 1,
            'food_name': row['food_name'],
            'category': row['food_category'],
            'nutrient_value': nutrient_value,
            'nutrient_calorie_ratio': row['nutrient_calorie_ratio'],
            'nutrition_score': round(nutrition_score, 2),
            'level': level,
            'calories': row['calories'],
            'comprehensive_data': {
                'protein': row['protein'],
                'fat': row['fat'],
                'carbohydrate': row['carbohydrate'],
                'fiber': row['fiber'],
                'vitamin_a': row['vitamin_a'],
                'vitamin_c': row['vitamin_c'],
                'calcium': row['calcium'],
                'iron': row['iron']
            }
        }
        ranking_data.append(ranking_item)

    # Build the ranking report; guard against an empty result set
    values = [item['nutrient_value'] for item in ranking_data]
    average_value = round(sum(values) / len(values), 2) if values else None
    analysis_report = {
        'nutrient_type': nutrient_type,
        'analysis_type': analysis_type,
        'total_count': len(ranking_data),
        'average_value': average_value,
        # Category of the first-ranked food
        'top_category': ranking_data[0]['category'] if ranking_data else None,
        'ranking_data': ranking_data
    }

    return JsonResponse({'status': 'success', 'data': analysis_report})
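
To make the nutrition score in the function above easier to check by hand, here is a small standalone sketch of the same weighted formula. The helper name `nutrition_score`, the `SCORE_WEIGHTS` constant, and the sample food values are assumptions for illustration; only the weights themselves come from the listing.

```python
# Weights copied from the ranking view; they sum to 1.0.
SCORE_WEIGHTS = {
    "protein": 0.25, "fiber": 0.20, "vitamin_a": 0.15,
    "vitamin_c": 0.15, "calcium": 0.15, "iron": 0.10,
}


def nutrition_score(food):
    """Weighted nutrient sum divided by calories, times 100.

    Returns None when calories are not positive, since the ratio is undefined.
    """
    if food["calories"] <= 0:
        return None
    weighted = sum(food[name] * weight for name, weight in SCORE_WEIGHTS.items())
    return round(weighted / food["calories"] * 100, 2)


# Made-up values for a hypothetical food.
sample_food = {
    "protein": 20.0, "fiber": 2.0, "vitamin_a": 0.0, "vitamin_c": 0.0,
    "calcium": 10.0, "iron": 2.0, "calories": 200.0,
}
print(nutrition_score(sample_food))
```

For the sample values the weighted sum is 5.0 + 0.4 + 1.5 + 0.2 = 7.1, so the score is 7.1 / 200 × 100 = 3.55. If the dataset stores some nutrients in grams and others in milligrams, the columns with larger raw numbers will dominate this score, so normalizing units first may be worth considering.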

# Core feature 3: dietary health risk analysis

# Nutrient columns read from food_nutrition and summed over the day's intake
INTAKE_NUTRIENTS = ['calories', 'protein', 'fat', 'carbohydrate', 'fiber',
                    'vitamin_a', 'vitamin_c', 'calcium', 'iron', 'sodium', 'sugar']


def dietary_health_risk_analysis(request):
    user_id = request.POST.get('user_id')  # read from the request but unused below
    try:
        daily_foods = json.loads(request.POST.get('daily_foods', '[]'))
        user_profile = json.loads(request.POST.get('user_profile', '{}'))
    except json.JSONDecodeError:
        return JsonResponse({'status': 'error', 'message': 'Request data is not valid JSON'})

    food_names = [food['name'] for food in daily_foods]
    if not food_names:
        return JsonResponse({'status': 'error', 'message': 'No valid food data provided'})

    # Read the user's basic profile, falling back to defaults
    age = user_profile.get('age', 25)
    gender = user_profile.get('gender', 'male')
    weight = user_profile.get('weight', 70)
    height = user_profile.get('height', 170)
    activity_level = user_profile.get('activity_level', 'moderate')

    # Basal metabolic rate. The coefficients match the revised Harris-Benedict
    # equation (weight in kg, height in cm, age in years).
    if gender == 'male':
        bmr = 88.362 + (13.397 * weight) + (4.799 * height) - (5.677 * age)
    else:
        bmr = 447.593 + (9.247 * weight) + (3.098 * height) - (4.330 * age)

    # Scale BMR by activity level to estimate the daily calorie need
    activity_multipliers = {'low': 1.2, 'moderate': 1.55, 'high': 1.725, 'very_high': 1.9}
    daily_calorie_need = bmr * activity_multipliers.get(activity_level, 1.55)

    # Look up the submitted foods with Spark. Filtering through the DataFrame API
    # instead of interpolating names into SQL text means a quote in a food name
    # cannot break the query or inject SQL.
    spark_session = get_spark_session()
    nutrition_table = spark_session.table('food_nutrition')
    food_nutrition_df = (
        nutrition_table
        .filter(nutrition_table['food_name'].isin(food_names))
        .select('food_name', *INTAKE_NUTRIENTS)
    )

    # Sum total intake. The amount is divided by 100, which suggests the table
    # stores nutrient values per 100 units (presumably grams) of food.
    food_intake_map = {food['name']: food.get('amount', 100) for food in daily_foods}
    total_nutrition = dict.fromkeys(INTAKE_NUTRIENTS, 0)
    for row in food_nutrition_df.collect():
        amount_ratio = food_intake_map.get(row['food_name'], 100) / 100
        for nutrient in INTAKE_NUTRIENTS:
            total_nutrition[nutrient] += row[nutrient] * amount_ratio

    # Without any calories (for example, no submitted food matched the table)
    # the percentage calculations below would divide by zero
    if total_nutrition['calories'] <= 0:
        return JsonResponse({'status': 'error', 'message': 'No matching food data with calories found'})

    # Health risk assessment
    risk_factors = []
    risk_score = 0

    # Calorie intake risk
    calorie_ratio = total_nutrition['calories'] / daily_calorie_need
    if calorie_ratio > 1.2:
        risk_factors.append({'type': 'Excess calories', 'level': 'High risk', 'description': 'Daily calorie intake exceeds the estimated need by more than 20%'})
        risk_score += 25
    elif calorie_ratio < 0.8:
        risk_factors.append({'type': 'Insufficient calories', 'level': 'Medium risk', 'description': 'Daily calorie intake is more than 20% below the estimated need'})
        risk_score += 15

    # Macronutrient ratio risk (4 kcal/g for protein and carbohydrate, 9 kcal/g for fat)
    protein_calorie_percent = (total_nutrition['protein'] * 4) / total_nutrition['calories'] * 100
    fat_calorie_percent = (total_nutrition['fat'] * 9) / total_nutrition['calories'] * 100
    carb_calorie_percent = (total_nutrition['carbohydrate'] * 4) / total_nutrition['calories'] * 100

    if protein_calorie_percent < 10:
        risk_factors.append({'type': 'Insufficient protein', 'level': 'Medium risk', 'description': 'Protein provides less than 10% of total calories'})
        risk_score += 20
    if fat_calorie_percent > 35:
        risk_factors.append({'type': 'Excess fat', 'level': 'High risk', 'description': 'Fat provides more than 35% of total calories'})
        risk_score += 30
    if total_nutrition['sodium'] > 2300:
        risk_factors.append({'type': 'Excess sodium', 'level': 'High risk', 'description': 'Daily sodium intake exceeds the recommended value'})
        risk_score += 25

    # Fiber and vitamin shortfall risk
    if total_nutrition['fiber'] < 25:
        risk_factors.append({'type': 'Insufficient dietary fiber', 'level': 'Medium risk', 'description': 'Daily dietary fiber intake is too low'})
        risk_score += 15
    if total_nutrition['vitamin_c'] < 90:
        risk_factors.append({'type': 'Insufficient vitamin C', 'level': 'Medium risk', 'description': 'Vitamin C intake is below the recommended value'})
        risk_score += 10

    # Generate health suggestions
    health_suggestions = []
    if calorie_ratio > 1.2:
        health_suggestions.append('Reduce high-calorie foods and increase the share of vegetables and low-calorie foods')
    if protein_calorie_percent < 10:
        health_suggestions.append('Add high-quality protein sources such as lean meat, fish, and legumes')
    if total_nutrition['fiber'] < 25:
        health_suggestions.append('Eat more whole grains, vegetables, and fruit')
    if total_nutrition['sodium'] > 2300:
        health_suggestions.append('Cut back on processed foods and high-salt seasonings')

    risk_analysis_result = {
        'total_nutrition': total_nutrition,
        'daily_need': daily_calorie_need,
        'nutrition_ratios': {
            'protein_percent': round(protein_calorie_percent, 1),
            'fat_percent': round(fat_calorie_percent, 1),
            'carb_percent': round(carb_calorie_percent, 1)
        },
        'risk_score': min(risk_score, 100),
        'risk_level': 'High risk' if risk_score >= 60 else 'Medium risk' if risk_score >= 30 else 'Low risk',
        'risk_factors': risk_factors,
        'health_suggestions': health_suggestions
    }

    return JsonResponse({'status': 'success', 'data': risk_analysis_result})
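
The calorie arithmetic in the risk analysis can be checked in isolation. The sketch below pulls it into plain functions; the function names are hypothetical, and the units (kilograms, centimetres, years, grams) are an assumption inferred from the default values of 70, 170, and 25 in the listing. The coefficients appear to match the revised Harris-Benedict equation.

```python
ACTIVITY_MULTIPLIERS = {"low": 1.2, "moderate": 1.55, "high": 1.725, "very_high": 1.9}


def basal_metabolic_rate(gender, weight_kg, height_cm, age_years):
    """Estimate BMR in kcal/day with the coefficients used in the listing."""
    if gender == "male":
        return 88.362 + 13.397 * weight_kg + 4.799 * height_cm - 5.677 * age_years
    return 447.593 + 9.247 * weight_kg + 3.098 * height_cm - 4.330 * age_years


def daily_calorie_need(gender, weight_kg, height_cm, age_years, activity_level="moderate"):
    """Scale BMR by an activity multiplier; unknown levels fall back to 1.55."""
    multiplier = ACTIVITY_MULTIPLIERS.get(activity_level, 1.55)
    return basal_metabolic_rate(gender, weight_kg, height_cm, age_years) * multiplier


def macro_calorie_percentages(protein_g, fat_g, carb_g, total_calories):
    """Share of calories from each macronutrient (4/9/4 kcal per gram)."""
    if total_calories <= 0:
        raise ValueError("total_calories must be positive")
    return {
        "protein_percent": round(protein_g * 4 / total_calories * 100, 1),
        "fat_percent": round(fat_g * 9 / total_calories * 100, 1),
        "carb_percent": round(carb_g * 4 / total_calories * 100, 1),
    }


# The listing's default profile: male, 70 kg, 170 cm, 25 years, moderate activity.
bmr = basal_metabolic_rate("male", 70, 170, 25)
need = daily_calorie_need("male", 70, 170, 25, "moderate")
# Made-up daily intake, only to exercise the percentage helper.
shares = macro_calorie_percentages(50, 70, 250, 2000)
print(round(bmr, 3), round(need, 2), shares)
```

For the default profile the BMR works out to 88.362 + 937.79 + 815.83 - 141.925 = 1700.057 kcal, and multiplying by 1.55 gives about 2635 kcal per day. Keeping these calculations in small pure functions also makes them easy to unit-test separately from the Django view.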