2024-11-30

Java内存管理与调优实践

第一章：内存泄漏分析（MemoryLeakDemo）

1.1 实验环境

# JDK版本
java version "11.0.25" 2024-10-15
OpenJDK Runtime Environment (build 11.0.25+8-post-Ubuntu-1ubuntu122.04)
OpenJDK 64-Bit Server VM (build 11.0.25+8-post-Ubuntu-122.04, mixed mode, sharing)

# 操作系统
WSL2 (Ubuntu 22.04)

1.2 示例代码

package com.example;

import java.util.ArrayList;
import java.util.List;

/**
 * 这是一个演示内存泄漏的示例程序
 * 通过持续创建并保持对大对象的引用导致内存泄漏
 */
public class MemoryLeakDemo {
    // 使用静态List持续存储对象引用
    private static List<byte[]> list = new ArrayList<>();

    public static void main(String[] args) {
        System.out.println("开始运行内存泄漏演示程序...");
        System.out.println("当前JVM的最大堆内存: " + 
            Runtime.getRuntime().maxMemory() / 1024 / 1024 + "MB");
        
        try {
            int count = 0;
            while (true) {
                // 持续创建1MB大小的字节数组
                list.add(new byte[1024 * 1024]);
                count++;
                
                // 每10MB输出一次状态
                if (count % 10 == 0) {
                    System.out.println("已分配内存: " + count + "MB");
                    Thread.sleep(100); // 添加延迟，方便观察
                }
            }
        } catch (OutOfMemoryError e) {
            System.out.println("发生内存溢出错误！");
            System.out.println("总共分配了: " + list.size() + "MB 的内存");
            throw e;
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}

1.3 运行与分析

1.3.1 运行命令

# 编译
mvn clean package

# 运行（添加JVM参数）（jdk8之前)
java -Xmx512m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=./dump.hprof -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:gc.log -jar target/java-jvm-tuning-1.0-SNAPSHOT.jar

-Xmx512m: 限制最大堆内存为512MB
-XX:+HeapDumpOnOutOfMemoryError: 在发生OOM时自动生成堆转储文件
-XX:HeapDumpPath=./dump.hprof: 指定堆转储文件的位置
-XX:+PrintGCDetails: 打印详细的GC日志
-XX:+PrintGCDateStamps: 在GC日志中包含时间戳
-Xloggc:gc.log: 将GC日志输出到文件 jdk9之后的命令：

java -Xmx512m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=./dump.hprof -Xlog:gc*=debug:file=gc.log:time,uptime,level,tags -jar target/java-jvm-tuning-1.0-SNAPSHOT.jar

-Xmx512m: 限制最大堆内存为512MB
-XX:+HeapDumpOnOutOfMemoryError: 在发生OOM时自动生成堆转储文件
-XX:HeapDumpPath=./dump.hprof: 指定堆转储文件的位置
-Xlog:gc*=debug:file=gc.log:time,uptime,level,tags: 新的GC日志配置格式，替代了旧的PrintGC相关参数
- gc*: 记录所有GC相关的事件
- debug: 日志级别
- file=gc.log: 输出到文件
- time,uptime,level,tags: 日志中包含的信息

1.3.2 运行结果

开始运行内存泄漏演示程序...
当前JVM的最大堆内存: 512MB
已分配内存: 10MB
已分配内存: 20MB
...
已分配内存: 250MB
发生内存溢出错误！
总共分配了: 253MB 的内存
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

分析一下运行结果：

JVM的最大堆内存是8006MB（约8GB）
程序持续分配内存直到3990MB时发生了OutOfMemoryError
这个错误发生在main线程中

1.3.3 问题分析

虽然我们设置了-Xmx512m，但实际可用的堆内存会小于这个值，因为：
- JVM需要预留一些空间给永久代/元空间
- 堆内存分为年轻代和老年代，有一定的比例分配
- GC需要保留一定的空余空间才能正常工作
在我们的程序中，由于不断创建新的字节数组，这些对象最终会被提升到老年代，当老年代空间不足时就会触发OOM

内存泄漏原因：
- 持续创建大对象（1MB字节数组）
- 保持对象的强引用
- 没有释放不需要的对象
影响：
- 内存持续增长
- GC无法回收
- 最终导致OOM

1.4 GC日志分析

1.4.1 查看gc日志

截取的一小部分日志：

[2024-12-07T17:08:09.670+0800][0.018s][info][gc,heap] Heap region size: 1M
[2024-12-07T17:08:09.670+0800][0.018s][debug][gc,heap] Minimum heap 8388608  Initial heap 526385152  Maximum heap 536870912
[2024-12-07T17:08:09.671+0800][0.019s][debug][gc     ] ConcGCThreads: 5 offset 42
[2024-12-07T17:08:09.671+0800][0.020s][debug][gc     ] ParallelGCThreads: 18
[2024-12-07T17:08:09.672+0800][0.020s][debug][gc     ] Initialize mark stack with 4096 chunks, maximum 16384
[2024-12-07T17:08:09.672+0800][0.020s][debug][gc,ergo,heap] Expand the heap. requested expansion amount: 526385152B expansion amount: 526385152B
[2024-12-07T17:08:09.675+0800][0.023s][debug][gc,ihop     ] Target occupancy update: old: 0B, new: 526385152B
[2024-12-07T17:08:09.675+0800][0.023s][debug][gc,ergo,refine] Initial Refinement Zones: green: 18, yellow: 54, red: 90, min yellow size: 36
[2024-12-07T17:08:09.676+0800][0.024s][info ][gc            ] Using G1
[2024-12-07T17:08:09.676+0800][0.024s][info ][gc,heap,coops ] Heap address: 0x00000000e0000000, size: 512 MB, Compressed Oops mode: 32-bit
[2024-12-07T17:08:09.677+0800][0.025s][debug][gc,ergo,heap  ] Attempt heap expansion (allocate archive regions). Total size: 1048576B
[2024-12-07T17:08:09.677+0800][0.025s][info ][gc,cds        ] Mark closed archive regions in map: [0x00000000ffe00000, 0x00000000ffe6bff8]
[2024-12-07T17:08:09.677+0800][0.025s][debug][gc,ergo,heap  ] Attempt heap expansion (allocate archive regions). Total size: 1048576B
[2024-12-07T17:08:09.677+0800][0.025s][info ][gc,cds        ] Mark open archive regions in map: [0x00000000ffc00000, 0x00000000ffc47ff8]
[2024-12-07T17:08:10.889+0800][1.237s][debug][gc,ergo,ihop  ] Request concurrent cycle initiation (occupancy higher than threshold) occupancy: 236978176B allocation request: 1048592B threshold: 236873318B (44.82) source: concurrent humongous allocation
[2024-12-07T17:08:10.889+0800][1.237s][debug][gc,ergo       ] Request concurrent cycle initiation (requested by GC cause). GC cause: G1 Humongous Allocation
[2024-12-07T17:08:10.890+0800][1.238s][debug][gc,heap       ] GC(0) Heap before GC invocations=0 (full 0): garbage-first heap   total 516096K, used 231120K [0x00000000e0000000, 0x0000000100000000)
[2024-12-07T17:08:10.890+0800][1.238s][debug][gc,heap       ] GC(0)   region size 1024K, 2 young (2048K), 0 survivors (0K)
[2024-12-07T17:08:10.890+0800][1.238s][debug][gc,heap       ] GC(0)  Metaspace       used 558K, capacity 4570K, committed 4864K, reserved 1056768K
[2024-12-07T17:08:10.890+0800][1.238s][debug][gc,heap       ] GC(0)   class space    used 46K, capacity 419K, committed 512K, reserved 1048576K
[2024-12-07T17:08:10.890+0800][1.238s][debug][gc,ergo       ] GC(0) Initiate concurrent cycle (concurrent cycle initiation requested)
[2024-12-07T17:08:10.890+0800][1.238s][info ][gc,start      ] GC(0) Pause Young (Concurrent Start) (G1 Humongous Allocation)
[2024-12-07T17:08:10.891+0800][1.239s][info ][gc,task       ] GC(0) Using 12 workers of 18 for evacuation
[2024-12-07T17:08:10.891+0800][1.239s][debug][gc,tlab       ] GC(0) TLAB totals: thrds: 3  refills: 5 max: 3 slow allocs: 113 max 113 waste: 49.2% gc: 1031440B max: 524256B slow: 1168B max: 1168B fast: 0B max: 0B
[2024-12-07T17:08:10.892+0800][1.240s][debug][gc,alloc,region] GC(0) Mutator Allocation stats, regions: 2, wasted size: 0B ( 0.0%)
[2024-12-07T17:08:10.892+0800][1.240s][debug][gc,age         ] GC(0) Desired survivor size 2097152 bytes, new threshold 15 (max threshold 15)
[2024-12-07T17:08:10.892+0800][1.240s][debug][gc,ergo,cset   ] GC(0) Finish choosing CSet. old: 0 regions, predicted old region time: 0.00ms, time remaining: 158.74
[2024-12-07T17:08:10.892+0800][1.240s][debug][gc,task,stats  ] GC(0) GC Termination Stats
[2024-12-07T17:08:10.892+0800][1.240s][debug][gc,task,stats  ] GC(0)      elapsed  --strong roots-- -------termination------- ------waste (KiB)------
[2024-12-07T17:08:10.892+0800][1.240s][debug][gc,task,stats  ] GC(0) thr     ms        ms      %        ms      %    attempts  total   alloc    undo
[2024-12-07T17:08:10.892+0800][1.240s][debug][gc,task,stats  ] GC(0) --- --------- --------- ------ --------- ------ -------- ------- ------- -------
[2024-12-07T17:08:10.897+0800][1.245s][debug][gc,task,stats  ] GC(0)   1      4.11      4.10  99.85      0.00   0.00        1       0       0       0
[2024-12-07T17:08:10.897+0800][1.245s][debug][gc,task,stats  ] GC(0)  11      3.68      2.03  55.16      1.58  42.99        1       0       0       0
[2024-12-07T17:08:10.897+0800][1.245s][debug][gc,task,stats  ] GC(0)   5      4.41      1.03  23.36      2.91  66.03      107       0       0       0
[2024-12-07T17:08:10.897+0800][1.245s][debug][gc,task,stats  ] GC(0)   3      4.54      1.09  24.00      2.86  62.94      123       0       0       0
[2024-12-07T17:08:10.897+0800][1.245s][debug][gc,task,stats  ] GC(0)   6      4.73      1.04  21.96      2.88  60.95       96       0       0       0
[2024-12-07T17:08:10.897+0800][1.245s][debug][gc,task,stats  ] GC(0)   0      4.88      3.73  76.51      0.43   8.90        1       0       0       0
[2024-12-07T17:08:10.897+0800][1.246s][debug][gc,task,stats  ] GC(0)  10      4.81      3.80  78.85      0.20   4.16        1       0       0       0
[2024-12-07T17:08:10.898+0800][1.246s][debug][gc,task,stats  ] GC(0)   8      4.95      1.01  20.44      2.74  55.27      147       0       0       0
[2024-12-07T17:08:10.898+0800][1.246s][debug][gc,task,stats  ] GC(0)   4      5.13      4.02  78.25      0.07   1.30        1       0       0       0
[2024-12-07T17:08:10.898+0800][1.246s][debug][gc,task,stats  ] GC(0)   7      5.20      1.01  19.48      2.82  54.23       75       0       0       0
[2024-12-07T17:08:10.898+0800][1.246s][debug][gc,task,stats  ] GC(0)   2      5.33      3.66  68.63      0.44   8.27        1       1       1       0
[2024-12-07T17:08:10.898+0800][1.246s][debug][gc,task,stats  ] GC(0)   9      5.30      3.70  69.70      0.27   5.05        1       0       0       0
[2024-12-07T17:08:10.898+0800][1.246s][debug][gc,ergo        ] GC(0) Running G1 Clear Card Table Task using 1 workers for 1 units of work for 3 regions.
[2024-12-07T17:08:10.898+0800][1.246s][debug][gc,ref         ] GC(0) Skipped phase1 of Reference Processing due to unavailable references
[2024-12-07T17:08:10.898+0800][1.246s][debug][gc,ref         ] GC(0) Skipped phase3 of Reference Processing due to unavailable references
[2024-12-07T17:08:10.898+0800][1.246s][debug][gc,ergo        ] GC(0) Running G1 Free Collection Set using 1 workers for collection set length 2

![[Pasted image 20241207232841.png]]

从GC日志中我们可以看到一些重要信息：

JVM使用的是G1垃圾收集器（Using G1）
每个内存区域（region）大小是1MB（Heap region size: 1M）
我们创建的1MB大小的字节数组被视为"巨型对象"（humongous object）
这些巨型对象直接分配在老年代中，而且由于它们的大小，每个对象至少占用一个完整的region
分析dump.hprof文件 dump.hprof文件包含了发生OOM时的堆内存快照

1.4.2 gc 日志解析

关于巨型对象（Humongous Objects）

在G1垃圾收集器中，大小超过单个Region大小一半的对象被称为巨型对象
我们的Region大小是1MB（从日志中可以看到 Heap region size: 1M）
所以超过512KB的对象就会被视为巨型对象
new byte[1024 * 1024] 计算过程：
- 1024（1K） * 1024（1K） = 1,048,576 字节（1MB）
- 1024是2的10次方，所以 1024 * 1024 = 2^20 字节 = 1MB

GC日志关键部分解析：

[2024-12-07T17:08:09.671+0800][0.020s][debug][gc] ConcGCThreads: 5 offset 42
[2024-12-07T17:08:09.671+0800][0.020s][debug][gc] ParallelGCThreads: 18

ConcGCThreads: 5 - 并发GC线程数，用于并发标记等后台任务
ParallelGCThreads: 18 - 并行GC线程数，用于STW（Stop-The-World）阶段

[2024-12-07T17:08:10.891+0800][1.239s][debug][gc,tlab] GC(0) TLAB totals: thrds: 3 refills: 5 max: 3 slow allocs: 113 max 113 waste: 49.2% ...

TLAB（Thread Local Allocation Buffer）信息：

thrds: 3 - 3个线程在使用TLAB
refills: 5 - TLAB重新填充次数
waste: 49.2% - TLAB空间浪费率

[2024-12-07T17:08:10.892+0800][1.240s][debug][gc,age] GC(0) Desired survivor size 2097152 bytes, new threshold 15 (max threshold 15)

存活对象年龄信息：

Desired survivor size 2097152 bytes - 期望的存活区大小约2MB
new threshold 15 - 对象在进入老年代前最多经历15次GC

[2024-12-07T17:08:10.892+0800][1.240s][debug][gc,task,stats] GC(0) GC Termination Stats

GC任务统计信息：

elapsed - 任务耗时
strong roots - 强引用根扫描时间
termination - GC终止阶段时间
waste - 内存浪费统计

gc 日志中重点关注：
GC原因和类型
内存使用情况
GC暂停时间
内存回收效果

1.4.3 概念基础

GC线程类型区别：

ParallelGCThreads（STW线程）：
- STW（Stop-The-World）意味着暂停所有应用线程
- 在GC时必须停止所有Java业务线程的阶段使用
- 比如：根对象扫描、对象复制等必须暂停应用的操作
ConcGCThreads（并发线程）：
- 可以与应用线程并发执行的GC线程
- 主要用于后台标记、清理等可以与应用并发的任务
- 通常数量比ParallelGCThreads少，默认为ParallelGCThreads的1/4

TLAB（Thread Local Allocation Buffer）：

即使不显式使用ThreadLocal，JVM也会为每个线程分配一个TLAB
TLAB的目的：
- 避免多线程分配对象时的同步问题
- 每个线程都有自己的一小块内存空间用于分配对象
- 只有TLAB用完才需要同步分配新的TLAB

GC Roots（根对象）：

就像是寻找"有用物品"的起点
从这些起点可以找到的对象就是"存活"的
典型的GC Roots包括：
- 正在执行的方法里的局部变量
- static变量（就像我们Demo中的static List）
- JNI引用的对象

Survivor区大小与Region大小的关系：

Heap Region size (1MB) 是G1 GC的最小内存管理单位
Survivor区实际上是由多个Region组成的
gc日志中显示Survivor区大小：2MB的Survivor期望大小意味着会使用2个Region
这种设计允许更灵活的内存管理：Survivor区可以根据需要增减Region数量

GC日志与堆转储的区别：

GC日志（gc.log）：
- 记录垃圾收集过程
- 包含内存分配和回收的动态信息
- 适合分析GC性能和内存使用趋势
堆转储（heap dump）：
- 某一时刻的内存快照
- 包含所有对象的详细信息和引用关系
- 适合分析内存泄漏和对象关系

1.4 使用VisualVM分析

直接从官网下载运行即可

实时监控工具，提供图形界面
主要功能：
- CPU和内存使用实时监控
- 线程状态可视化
- 堆内存分布图表
适合实时监控和快速诊断 ![[Pasted image 20241207234002.png]]
点击左侧的"File" -> “Load”

加载后，你会看到几个主要标签页：
- Summary（概览）：显示基本信息
- Classes（类）：显示所有加载的类及其实例数量
- Instances（实例）：可以查看具体对象实例
- References（引用）：显示对象之间的引用关系
重点关注的内容：
- Classes标签页中按实例数量排序，找出数量异常多的类
- 查看大对象（Biggest Objects）
- 检查是否有明显的内存泄漏模式（比如List或Map持续增长）

1.4.1 堆转储基本信息

![[Pasted image 20241207180945.png]]

1.4.2 内存分布

主要对象分布：
1. byte[]        : 4,371实例, 266.5MB (99.72%)
2. String        : 7,105实例, 170.5KB
3. Object[]      : 1,049实例, 95.2KB

1.4.3 对象实例分析

内存溢出原因：

堆总大小：267,314,568 字节（约255MB）
发生了OutOfMemoryError
主线程(“main” prio=5 tid=1)触发了这个错误

对象分布异常：

byte[]数组占用最多内存：266,667,792 B (99.8%的堆空间！)
有5个大小相同的byte[]数组，每个都是1,048,576字节（1MB）
String对象数量很多：7,105个（28.7%的实例数）

问题原因：

大量的byte[]数组占用了几乎所有堆空间
这些数组大小都是1MB，可能是我们程序中创建了大量固定大小的缓冲区在MemoryLeakDemo.java中，我们看到：

// 每次分配1MB的内存
list.add(new byte[1024 * 1024]);

这段代码不断创建1MB的字节数组并保存在列表中，没有任何释放机制，导致：

堆中充满了大量的1MB字节数组
因为被list强引用，这些数组无法被GC回收
最终导致OutOfMemoryError

这就解释了为什么堆转储中显示：

有多个1MB（1,048,576字节）的byte数组
这些数组占用了99.8%的堆空间

1.5 使用MAT深入分析

MAT的主要分析功能：

Histogram（直方图）：
- 显示所有对象的实例数和占用空间
- 可以按包名、类名进行过滤
- 右键菜单提供多种分析选项
Dominator Tree（支配树）：
- 显示最大的对象及其依赖关系
- 帮助找出内存占用大户
- 可以看到对象之间的引用链
Leak Suspects（泄漏嫌疑）：
- 自动分析可能的内存泄漏
- 生成详细的问题报告
- 提供优化建议
Path to GC Roots：
- 分析对象为什么没有被垃圾回收
- 显示从GC Roots到目标对象的完整引用链
- 帮助找出内存泄漏的根源

使用步骤: 官网下载后后直接运行 ![[Pasted image 20241207234347.png]]

点击File -> Open Heap Dump
选择我们的dump.hprof文件
MAT会自动生成内存泄漏分析报告

1.5.1 分析报告概览

首先查看Overview（概览）
然后看Leak Suspects（泄漏嫌疑）报告
使用Histogram查看对象分布
最后用Dominator Tree分析大对象

![[Pasted image 20241207182931.png]]

内存泄漏嫌疑点：

一个ArrayList实例占用了266,343,872字节（约254MB，占总内存的99.72%）
这个ArrayList由main线程持有（“java.lang.Thread @ 0xffd008a0 main”）
具体位置：MemoryLeakDemo.java:22

问题定位：

主要内存累积在一个Object[]数组中
调用栈显示是在com.example.MemoryLeakDemo.main方法中创建
这与我们之前在VisualVM中看到的结果吻合

1.5.2 Histogram视图分析

![[Pasted image 20241207182702.png]]

主要对象分布：
1. byte[]        : 4,371实例, 266.5MB
2. String        : 7,105实例, 170.5KB
3. Object[]      : 1,049实例, 95.2KB

这与VisualVM的分析结果一致，但提供了更详细的数量统计

1.5.3 Thread Stack分析

![[Pasted image 20241207182721.png]]

main线程堆栈：
at java.lang.OutOfMemoryError.<init>()V (OutOfMemoryError.java:48)
at com.example.MemoryLeakDemo.main(MemoryLeakDemo.java:22)

1.5.4 Dominator Tree分析

![[Pasted image 20241207182553.png]] ![[Pasted image 20241207182603.png]]

内存占用层级：
1. ArrayList @ 0xffd00700    : 266,343,872字节 (99.72%)

其他类占用都非常小，最大的也只有39KB左右
清晰地显示了内存占用的层级关系

1.5.5 Path to GC Roots引用链分析

在MAT中查看对象引用链有两种方法：

通过Histogram视图：
- 在Histogram中找到目标类（比如我们的byte[]）
- 右键点击 -> List objects -> with incoming references
- 这会显示所有指向这个对象的引用
通过Dominator Tree：
- 找到我们感兴趣的对象（比如ArrayList @ 0xffd00700）
- 右键点击 -> Path To GC Roots -> exclude weak references
- 这会显示从GC根到这个对象的完整引用路径

让我们试试第二种方法，因为我们已经知道主要问题在那个ArrayList上。这样可以看到：

是谁持有了这个ArrayList
为什么GC无法回收它
完整的引用链路

![[Pasted image 20241207235107.png]] 从引用链分析结果我们可以看到一个非常清晰的路径：

ArrayList @ 0xffd00700
  ↑
java.lang.Thread @ 0xffd008a0 (main)
  ↑
list (in com.example.MemoryLeakDemo)
  ↑
appClass (sun.launcher.LauncherHelper)

通过引用链分析，我们可以清楚地看到问题的根源：

ArrayList被main线程通过静态变量强引用
这导致GC无法回收任何byte[]数组
最终因为持续创建新数组导致OOM

1.6 问题总结

内存泄漏原因：
- 静态ArrayList持续增长
- 强引用链阻止GC回收
- 没有大小限制机制
泄漏路径：
- 应用代码创建byte[]数组
- 通过list变量持有引用
- main线程保持list存活
优化方向：
- 使用软引用包装大对象
- 实现集合大小限制
- 及时清理不用的引用

第二章：内存管理优化（MemoryManagementDemo）

在第一章中的问题：

ArrayList被main线程通过静态变量强引用
这导致GC无法回收任何byte[]数组
最终因为持续创建新数组导致OOM

找到问题所在我们来对代码进行改进：

使用SoftReference替代强引用
实现集合大小限制
定期清理失效的引用

2.1 示例代码

package com.example;

import java.lang.ref.SoftReference;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * 这是一个展示正确内存管理的示例程序
 * 使用软引用和限制集合大小来避免内存泄漏
 */
public class MemoryManagementDemo {
    // 使用软引用存储大对象
    private static List<SoftReference<byte[]>> list = new ArrayList<>();
    // 限制最大缓存对象数量
    private static final int MAX_CACHE_SIZE = 100;
    // 使用原子计数器跟踪创建的对象总数
    private static AtomicInteger totalCreated = new AtomicInteger(0);
    // 使用原子计数器跟踪被GC回收的对象数
    private static AtomicInteger totalReclaimed = new AtomicInteger(0);

    public static void main(String[] args) throws InterruptedException {
        System.out.println("开始运行内存管理示例程序...");
        System.out.println("当前JVM的最大堆内存: " + 
            Runtime.getRuntime().maxMemory() / 1024 / 1024 + "MB");
        
        try {
            while (true) {
                // 创建1MB的字节数组
                byte[] data = new byte[1024 * 1024];
                totalCreated.incrementAndGet();
                
                // 使用软引用包装对象
                SoftReference<byte[]> ref = new SoftReference<>(data);
                
                // 限制集合大小
                if (list.size() >= MAX_CACHE_SIZE) {
                    list.remove(0);
                }
                list.add(ref);
                
                // 检查有多少对象被回收了
                int reclaimed = 0;
                for (SoftReference<byte[]> item : list) {
                    if (item.get() == null) {
                        reclaimed++;
                    }
                }
                totalReclaimed.addAndGet(reclaimed);
                
                // 每10MB输出一次状态
                if (totalCreated.get() % 10 == 0) {
                    System.out.printf("状态报告：已创建 %d MB, 已回收 %d MB, 当前缓存大小 %d%n",
                        totalCreated.get(), totalReclaimed.get(), list.size());
                    Thread.sleep(100); // 添加延迟，方便观察
                }
                
                // 清理已被回收的引用
                list.removeIf(item -> item.get() == null);
            }
        } catch (OutOfMemoryError e) {
            System.out.println("发生内存溢出错误！");
            System.out.printf("总共创建了: %d MB的内存，回收了: %d MB%n",
                totalCreated.get(), totalReclaimed.get());
            throw e;
        }
    }
}

关键改进点：

 // 1. 使用软引用
SoftReference<byte[]> ref = new SoftReference<>(new byte[1024 * 1024]);

// 2. 限制集合大小
if (list.size() >= MAX_CACHE_SIZE) {
    list.remove(0);
}

// 3. 清理已被回收的引用
list.removeIf(item -> item.get() == null);

2.2 运行与分析

2.2.1 运行命令

在pom.xml中设置主类为MemoryManagementDemo

<build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.8.1</version>
                <configuration>
                    <source>${maven.compiler.source}</source>
                    <target>${maven.compiler.target}</target>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-jar-plugin</artifactId>
                <version>3.2.0</version>
                <configuration>
                    <archive>
                        <manifest>
                            <addClasspath>true</addClasspath>
                            <mainClass>com.example.MemoryManagementDemo</mainClass>
                        </manifest>
                    </archive>
                </configuration>
            </plugin>
        </plugins>
    </build>

之后运行：

mvn clean package
java -Xmx512m -XX:+HeapDumpOnOutOfMemoryError \
    -Xlog:gc*=debug:file=gc_manage.log:time,uptime,level,tags \
    -jar target/java-jvm-tuning-1.0-SNAPSHOT.jar

2.2.2 运行结果

...
状态报告：已创建 11650 MB, 已回收 0 MB, 当前缓存大小 100
状态报告：已创建 11660 MB, 已回收 0 MB, 当前缓存大小 100
状态报告：已创建 11670 MB, 已回收 0 MB, 当前缓存大小 100

可以看到即使已经使用了这么大的内存还是没有发出溢出使用visualVM查看线程转储: ![[Pasted image 20241208000442.png]]

堆大小：266,266,080 B（约254MB）
byte[]数组：7,449个实例，占用265,619,288 B（约253MB）
这些是之前运行MemoryLeakDemo时生成的堆转储文件

原因：

使用软引用：

private static List<SoftReference<byte[]>> list = new ArrayList<>();

软引用在内存压力大时会被自动回收

限制集合大小：

private static final int MAX_CACHE_SIZE = 100;
if (list.size() >= MAX_CACHE_SIZE) {
    list.remove(0);
}

列表永远不会超过100个元素

主动清理：

list.removeIf(item -> item.get() == null);

定期清理已被GC回收的引用

这就是为什么即使创建了11690MB的对象，程序仍然在运行：

列表大小被限制在100
当内存压力大时，软引用会被自动回收
被回收的引用会被从列表中删除

如果像进一步分析MemoryManagementDemo的运行情况：

使用这个命令运行：

java -Xmx512m -XX:+HeapDumpOnOutOfMemoryError \
    -Xlog:gc*=debug:file=gc_manage.log:time,uptime,level,tags \
    -jar target/java-jvm-tuning-1.0-SNAPSHOT.jar

打开VisualVM：
- 在左侧找到正在运行的Java进程
- 点击"Sampler"标签页查看实时内存使用
- 观察堆内存的变化趋势 ![[Pasted image 20241208000909.png]]
内存使用情况：
- byte[] 数组：159,431,264 B (38.9%)，357,973个对象
- char[] 数组：84,935,816 B (20.7%)，609,600个对象
- int[] 数组：39,321,304 B (9.6%)，752,608个对象
软引用的使用：
- java.lang.ref.SoftReference：1,269,320 B (0.3%)，31,733个对象
- 这说明软引用机制正在工作
内存分布：
- 最大的对象类型仍然是byte[]，但占比只有38.9%
- 内存分布更加均匀，没有单一类型占据绝大部分内存

这与之前MemoryLeakDemo的堆转储相比有很大不同：

之前：byte[]占用了99.8%的内存
现在：byte[]只占38.9%，且对象数量可控

这说明我们的优化策略是有效的：

软引用允许JVM在内存压力大时回收对象
内存使用更加均衡
没有发生内存泄漏

2.2.3 GC日志分析

部分gc日志：

[2024-12-07T17:08:10.870+0800][2.889s][info ][gc,start          ] GC(24) Pause Young (Concurrent Start) (G1 Humongous Allocation)
[2024-12-07T17:08:10.870+0800][2.889s][info ][gc,task           ] GC(24) Using 12 workers of 18 for evacuation
[2024-12-07T17:08:10.870+0800][2.889s][debug][gc,tlab           ] GC(24) TLAB totals: thrds: 1  refills: 1 max: 1 slow allocs: 1 max 1 waste: 100.0% gc: 503192B max: 503192B slow: 0B max: 0B fast: 0B max: 0B
[2024-12-07T17:08:10.870+0800][2.889s][debug][gc,alloc,region   ] GC(24) Mutator Allocation stats, regions: 1, wasted size: 0B ( 0.0%)
[2024-12-07T17:08:10.870+0800][2.890s][debug][gc,age            ] GC(24) Desired survivor size 2097152 bytes, new threshold 15 (max threshold 15)
[2024-12-07T17:08:10.871+0800][2.890s][debug][gc,ergo,cset      ] GC(24) Finish choosing CSet. old: 0 regions, predicted old region time: 0.00ms, time remaining: 179.08
[2024-12-07T17:08:10.872+0800][2.891s][debug][gc,task,stats     ] GC(24) GC Termination Stats
[2024-12-07T17:08:10.872+0800][2.892s][debug][gc,task,stats     ] GC(24)      elapsed  --strong roots-- -------termination------- ------waste (KiB)------
[2024-12-07T17:08:10.873+0800][2.892s][debug][gc,task,stats     ] GC(24) thr     ms        ms      %        ms      %    attempts  total   alloc    undo
[2024-12-07T17:08:10.873+0800][2.892s][debug][gc,task,stats     ] GC(24) --- --------- --------- ------ --------- ------ -------- ------- ------- -------
[2024-12-07T17:08:10.876+0800][2.896s][debug][gc,task,stats     ] GC(24)  10      2.09      2.08  99.67      0.00   0.02        1       0       0       0
[2024-12-07T17:08:10.877+0800][2.896s][debug][gc,task,stats     ] GC(24)   5      2.87      0.87  30.23      1.71  59.43       35       0       0       0
[2024-12-07T17:08:10.878+0800][2.897s][debug][gc,task,stats     ] GC(24)   0      3.73      1.56  41.80      1.38  36.88        6       0       0       0
[2024-12-07T17:08:10.878+0800][2.898s][debug][gc,task,stats     ] GC(24)   4      4.39      1.35  30.80      1.36  30.98        1       0       0       0
[2024-12-07T17:08:10.879+0800][2.898s][debug][gc,task,stats     ] GC(24)   2      4.96      1.31  26.38      1.45  29.18       15       0       0       0
[2024-12-07T17:08:10.879+0800][2.898s][debug][gc,task,stats     ] GC(24)   8      5.04      1.51  30.07      0.84  16.70        1       0       0       0
[2024-12-07T17:08:10.880+0800][2.899s][debug][gc,task,stats     ] GC(24)   3      5.52      0.90  16.25      1.46  26.38       10       0       0       0
[2024-12-07T17:08:10.880+0800][2.899s][debug][gc,task,stats     ] GC(24)  11      5.58      1.58  28.39      0.69  12.34        1       0       0       0
[2024-12-07T17:08:10.880+0800][2.899s][debug][gc,task,stats     ] GC(24)   7      5.92      0.90  15.24      1.40  23.64        4       0       0       0
[2024-12-07T17:08:10.880+0800][2.899s][debug][gc,task,stats     ] GC(24)   6      6.12      0.83  13.50      1.28  20.84       11       0       0       0
[2024-12-07T17:08:10.881+0800][2.900s][debug][gc,task,stats     ] GC(24)   1      6.63      1.22  18.33      1.23  18.61        3       0       0       0
[2024-12-07T17:08:10.881+0800][2.900s][debug][gc,task,stats     ] GC(24)   9      6.48      1.76  27.09      0.34   5.30        1       0       0       0
[2024-12-07T17:08:10.881+0800][2.900s][debug][gc,ergo           ] GC(24) Running G1 Clear Card Table Task using 1 workers for 1 units of work for 3 regions.
[2024-12-07T17:08:10.881+0800][2.900s][debug][gc,ref            ] GC(24) Skipped phase1 of Reference Processing due to unavailable references
[2024-12-07T17:08:10.881+0800][2.900s][debug][gc,ref            ] GC(24) Skipped phase3 of Reference Processing due to unavailable references
[2024-12-07T17:08:10.881+0800][2.901s][debug][gc,ergo           ] GC(24) Running G1 Free Collection Set using 1 workers for collection set length 2
[2024-12-07T17:08:10.882+0800][2.901s][debug][gc,humongous      ] GC(24) Live humongous region 0 object size 1048592 start 0x00000000e0000000  with remset 0 code roots 0 is marked 0 reclaim candidate 0 type array 1

![[Pasted image 20241208002430.png]]

[2024-12-07T17:08:10.911+0800][2.930s][info ][gc                ] GC(24) Pause Young (Concurrent Start) (G1 Humongous Allocation) 461M->461M(505M) 41.652ms
[2024-12-07T17:08:10.911+0800][2.930s][info ][gc,cpu            ] GC(24) User=0.03s Sys=0.00s Real=0.04s
[2024-12-07T17:08:10.911+0800][2.931s][info ][gc                ] GC(25) Concurrent Cycle
[2024-12-07T17:08:10.911+0800][2.931s][info ][gc,marking        ] GC(25) Concurrent Clear Claimed Marks
[2024-12-07T17:08:10.912+0800][2.931s][info ][gc,marking        ] GC(25) Concurrent Clear Claimed Marks 0.092ms
[2024-12-07T17:08:10.912+0800][2.931s][info ][gc,marking        ] GC(25) Concurrent Scan Root Regions
[2024-12-07T17:08:10.912+0800][2.931s][debug][gc,ergo           ] GC(25) Running G1 Root Region Scan using 1 workers for 1 work units.
[2024-12-07T17:08:10.912+0800][2.931s][info ][gc,marking        ] GC(25) Concurrent Scan Root Regions 0.488ms
[2024-12-07T17:08:10.912+0800][2.931s][info ][gc,marking        ] GC(25) Concurrent Mark (2.931s)
[2024-12-07T17:08:10.912+0800][2.931s][info ][gc,marking        ] GC(25) Concurrent Mark From Roots
[2024-12-07T17:08:10.912+0800][2.932s][info ][gc,task           ] GC(25) Using 5 workers of 5 for marking
[2024-12-07T17:08:10.913+0800][2.932s][debug][gc,stats          ] ---------------------------------------------------------------------
[2024-12-07T17:08:10.913+0800][2.932s][debug][gc,stats          ] Marking Stats, task = 0, calls = 54
[2024-12-07T17:08:10.913+0800][2.932s][debug][gc,stats          ]   Elapsed time = 0.47ms, Termination time = 0.00ms
[2024-12-07T17:08:10.913+0800][2.932s][debug][gc,stats          ]   Step Times (cum): num = 1232, avg = 0.01ms, sd = 0.14ms max = 3.57ms, total = 10.87ms
[2024-12-07T17:08:10.913+0800][2.933s][debug][gc,stats          ]   Mark Stats Cache: hits 3180 misses 231 ratio 93.228
[2024-12-07T17:08:10.913+0800][2.933s][debug][gc,stats          ] ---------------------------------------------------------------------
[2024-12-07T17:08:10.914+0800][2.933s][debug][gc,stats          ] Marking Stats, task = 1, calls = 1
[2024-12-07T17:08:10.914+0800][2.933s][debug][gc,stats          ]   Elapsed time = 0.45ms, Termination time = 0.00ms
[2024-12-07T17:08:10.914+0800][2.933s][debug][gc,stats          ]   Step Times (cum): num = 297, avg = 0.04ms, sd = 0.29ms max = 3.56ms, total = 10.71ms
[2024-12-07T17:08:10.914+0800][2.933s][debug][gc,stats          ]   Mark Stats Cache: hits 4216 misses 2 ratio 99.953
[2024-12-07T17:08:10.914+0800][2.933s][debug][gc,stats          ] ---------------------------------------------------------------------
[2024-12-07T17:08:10.914+0800][2.933s][debug][gc,stats          ] Marking Stats, task = 2, calls = 44
[2024-12-07T17:08:10.914+0800][2.933s][debug][gc,stats          ]   Elapsed time = 0.45ms, Termination time = 0.00ms
[2024-12-07T17:08:10.914+0800][2.933s][debug][gc,stats          ]   Step Times (cum): num = 432, avg = 0.02ms, sd = 0.23ms max = 3.57ms, total = 10.27ms
[2024-12-07T17:08:10.914+0800][2.933s][debug][gc,stats          ]   Mark Stats Cache: hits 605 misses 2 ratio 99.671
[2024-12-07T17:08:10.914+0800][2.933s][debug][gc,stats          ] ---------------------------------------------------------------------
[2024-12-07T17:08:10.914+0800][2.934s][debug][gc,stats          ] Marking Stats, task = 3, calls = 46
[2024-12-07T17:08:10.914+0800][2.934s][debug][gc,stats          ]   Elapsed time = 0.43ms, Termination time = 0.00ms
[2024-12-07T17:08:10.915+0800][2.934s][debug][gc,stats          ]   Step Times (cum): num = 354, avg = 0.03ms, sd = 0.27ms max = 3.57ms, total = 10.69ms
[2024-12-07T17:08:10.915+0800][2.934s][debug][gc,stats          ]   Mark Stats Cache: hits 617 misses 1 ratio 99.838
[2024-12-07T17:08:10.915+0800][2.934s][debug][gc,stats          ] ---------------------------------------------------------------------
[2024-12-07T17:08:10.915+0800][2.934s][debug][gc,stats          ] Marking Stats, task = 4, calls = 38
[2024-12-07T17:08:10.915+0800][2.934s][debug][gc,stats          ]   Elapsed time = 0.43ms, Termination time = 0.00ms
[2024-12-07T17:08:10.915+0800][2.934s][debug][gc,stats          ]   Step Times (cum): num = 383, avg = 0.03ms, sd = 0.25ms max = 3.57ms, total = 10.33ms
[2024-12-07T17:08:10.915+0800][2.934s][debug][gc,stats          ]   Mark Stats Cache: hits 1868 misses 1 ratio 99.946
[2024-12-07T17:08:10.915+0800][2.934s][debug][gc,stats          ] ---------------------------------------------------------------------
[2024-12-07T17:08:10.915+0800][2.934s][info ][gc,marking        ] GC(25) Concurrent Mark From Roots 2.972ms
[2024-12-07T17:08:10.915+0800][2.935s][info ][gc,marking        ] GC(25) Concurrent Preclean
[2024-12-07T17:08:10.915+0800][2.935s][debug][gc,ref,start      ] GC(25) Preclean SoftReferences
[2024-12-07T17:08:10.916+0800][2.935s][debug][gc,ref            ] GC(25) Preclean SoftReferences 0.094ms
[2024-12-07T17:08:10.916+0800][2.935s][debug][gc,ref,start      ] GC(25) Preclean WeakReferences

从日志中我们可以看到一次典型的G1 GC过程：

GC触发原因和基本信息：

[2024-12-07T17:08:10.870+0800] Pause Young (Concurrent Start) (G1 Humongous Allocation)

这是一次年轻代GC，触发原因是分配巨型对象（我们的1MB byte数组）
同时开始了并发标记周期（Concurrent Start）

GC前后的内存变化：

[2024-12-07T17:08:10.890+0800][1.238s][debug][gc,heap] GC(0) Heap before GC invocations=0 (full 0): 
garbage-first heap total 516096K, used 231120K [0x00000000e0000000, 0x0000000100000000)

GC前使用了约231MB内存（231120K）
总堆大小约516MB（516096K）

GC性能指标：

[2024-12-07T17:08:10.892+0800][1.240s][debug][gc,task,stats] GC(0) GC Termination Stats

从后续的任务统计可以看到：

最快的线程用时约2.09ms（线程10）
最慢的线程用时约6.63ms（线程1）
平均暂停时间在4-5ms左右

并发标记阶段：

[2024-12-07T17:08:11.061+0800][3.081s][info][gc,marking] GC(27) Concurrent Mark From Roots
[2024-12-07T17:08:11.062+0800][3.081s][info][gc,task] GC(27) Using 5 workers of 5 for marking

使用5个线程进行并发标记
标记缓存命中率很高（93%-99%）

内存回收效果：

GC(1348) Pause Young (Concurrent Start) (G1 Humongous Allocation) 229M->201M(512M) 29.514ms

这行显示：

GC前：229MB
GC后：201MB
总堆大小：512MB
这次GC耗时：29.514ms

2.2.4 有趣的现象

不知道大家有没有注意到这里触发的gc回收竟然是young gc, 回顾我们的程序，我们创建的是byte[] data = new byte[1024 * 1024];查看gc日志发现heap region的size才1MB，那我们这个创建的就是一个巨型对象，会直接分配在老年代的连续区域中

这就是为什么日志中有G1 Humongous Allocation的标记

为什么触发Young GC：

当尝试分配巨型对象时，G1首先需要找到连续的空闲regions
如果没有足够的连续空闲regions，G1会触发Young GC来尝试回收空间
这就是为什么我们看到Pause Young (Concurrent Start) (G1 Humongous Allocation)

关键在于我们使用了软引用：

SoftReference<byte[]> ref = new SoftReference<>(data);

软引用对象本身是在年轻代分配的
当内存压力大时，G1会回收软引用指向的对象
即使这些对象在老年代，也可以通过回收软引用来间接回收它们

查看日志中的证据：

[2024-12-07T17:08:10.915+0800][2.935s][debug][gc,ref,start] GC(25) Preclean SoftReferences
[2024-12-07T17:08:10.916+0800][2.935s][debug][gc,ref] GC(25) Preclean SoftReferences 0.094ms

这表明G1在GC过程中专门处理了软引用。

所以整个过程是：

尝试分配巨型对象
发现没有足够的连续空间
触发Young GC
Young GC过程中发现内存压力大
处理软引用，回收之前分配的巨型对象
为新的巨型对象分配空间

这就是为什么即使是巨型对象，通过软引用的机制，我们仍然能在Young GC中间接地回收它们。这也是我们的内存优化策略能够工作的关键原因。

2.3 优化效果对比

优化项	优化前	优化后
GC频率	频繁	稳定
暂停时间	不稳定	18-86ms
内存使用	持续增长	201-229MB波动
OOM风险	高	低

2.4 优化建议

2.4.1 G1收集器参数调优

# 1. 调整region大小
-XX:G1HeapRegionSize=4m  # 默认1-32MB，根据堆大小自动计算

# 2. 设置目标暂停时间
-XX:MaxGCPauseMillis=100  # 默认200ms

# 3. 调整新生代大小
-XX:G1NewSizePercent=30   # 默认5%
-XX:G1MaxNewSizePercent=60  # 默认60%

# 4. 并发标记线程数
-XX:ConcGCThreads=2  # 默认为CPU核数的1/4

2.4.2 代码优化步骤

使用软引用：

SoftReference<byte[]> ref = new SoftReference<>(new byte[1024 * 1024]);

限制集合大小：

private static final int MAX_CACHE_SIZE = 100;
if (list.size() >= MAX_CACHE_SIZE) {
    list.remove(0);
}

分块处理大数据：

int blockSize = 512 * 1024; // 512KB
for (int i = 0; i < totalSize; i += blockSize) {
    byte[] block = new byte[Math.min(blockSize, totalSize - i)];
    // 处理数据块
}

2.4.3 监控建议

添加GC详细日志：

-Xlog:gc*=debug:file=gc.log:time,uptime,level,tags
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps

设置堆内存转储：

-XX:GCHeapDumpOnOutOfMemory
-XX:HeapDumpPath=/path/to/dumps