A survey of .NET garbage collection and the changes CLR 4.0 brings, Part 2 (from the series on what is new in CLR 4.0)

This post continues the previous one, Part 1 of this survey of .NET garbage collection and the changes CLR 4.0 brings.

The changes CLR 4.0 brings are still not covered in this post; please see the next one.

Memory release and compaction

After building the object reference graph, the garbage collector releases the objects that are not in the graph (those no longer needed). Releasing them leaves the heap fragmented. The garbage collector then scans the managed heap for contiguous blocks of free memory and shifts the surviving objects down to lower addresses so that the free space becomes one contiguous block, and it updates every object reference to point to the object's new location. The effect is like tamping down the managed heap.
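The release-and-compact step described above can be sketched with a toy model. This is not how the CLR is implemented; `Block`, `ToyCompactor`, and the address/size fields are hypothetical names invented for illustration, and the "heap" is just a list assumed to be sorted by address.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: each "object" is a block with an address, a size and a live flag.
public sealed class Block
{
    public string Name;
    public int Address;
    public int Size;
    public bool Live;   // true if the block was found in the reference graph
}

public static class ToyCompactor
{
    // Drops dead blocks, slides live blocks toward address 0, and returns a map
    // from old address to new address so that references can be patched afterwards.
    public static Dictionary<int, int> Compact(List<Block> heap)
    {
        var forwarding = new Dictionary<int, int>();
        heap.RemoveAll(b => !b.Live);      // release the unreachable blocks
        int next = 0;                      // next free address after compaction
        foreach (var b in heap)            // heap is assumed sorted by address
        {
            forwarding[b.Address] = next;  // remember where this block moved to
            b.Address = next;
            next += b.Size;
        }
        return forwarding;
    }
}
```

With blocks A (address 0, size 16, live), B (address 16, size 32, dead) and C (address 48, size 8, live), the sketch removes B and moves C down to address 16, and the returned map tells the caller that anything pointing at 48 must now point at 16.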

Next comes the concept of generations, which was introduced to improve the overall performance of the garbage collector.

Generations

Think about what would happen if the garbage collector scanned every object in the whole managed heap in every collection cycle. Would it be slow? Yes, and that is why Microsoft introduced the concept of generations.

Why do generations help the garbage collector's performance? Based on observations of a large body of real-world programs, Microsoft made a few assumptions that hold for most programming practice:

  • The newer an object is, the shorter its lifetime tends to be.
  • The older an object is, the longer its lifetime tends to be.
  • Newer objects tend to have strong relationships to each other and are frequently accessed around the same time.
  • Compacting a portion of the heap is faster than compacting the whole heap.

With generations, most garbage collection activity can be confined to a smaller region of memory, which improves the collector's performance.

Let's see how the garbage collector actually implements generations:

Generation 0: newly created objects, that is, objects that have not yet been through a garbage collection.

Generation 1: objects that survived a generation 0 collection.

Generation 2: objects that survived a generation 1 or generation 2 collection. Generation 2 is the highest generation the garbage collector supports, so an object that survives a generation 2 collection stays in generation 2.

When the program starts, the garbage collector reserves memory for each generation; the initial sizes are decided by the policies of the .NET Framework. The garbage collector records the start address and size of each generation's memory. The three generations are contiguous and adjacent: generation 2 lies below generation 1, and generation 1 lies below generation 0. The CLR always allocates new managed objects in generation 0. If generation 0 has enough free memory, the CLR simply advances a pointer to complete the allocation, which is very fast. When generation 0 cannot hold a new object, the CLR triggers the garbage collector to reclaim the generation 0 objects that are no longer needed. When that is done, the garbage collector compacts the surviving generation 0 objects toward lower addresses, moves the allocation pointer to the start of the free space, and updates the affected references to the new addresses. At this point generation 0 is empty, because the objects that survived have become part of generation 1.
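Promotion between generations can be observed from managed code with GC.GetGeneration. The following is a small sketch, not a guarantee of runtime behavior: `GenerationDemo` and `Observe` are names made up for this example, and the values after each forced collection are what a default configuration typically reports rather than something the runtime promises.

```csharp
using System;

public static class GenerationDemo
{
    // Returns the generation reported for one object: right after allocation,
    // after one forced collection, and after a second forced collection.
    public static int[] Observe()
    {
        var obj = new object();
        int fresh = GC.GetGeneration(obj);       // newly allocated small object: generation 0
        GC.Collect();
        int afterFirst = GC.GetGeneration(obj);  // survived once: typically generation 1
        GC.Collect();
        int afterSecond = GC.GetGeneration(obj); // survived twice: typically generation 2
        GC.KeepAlive(obj);                       // keep obj reachable until this point
        return new[] { fresh, afterFirst, afterSecond };
    }
}
```

GC.MaxGeneration returns the highest generation number the runtime supports, which matches the statement above that objects never move beyond generation 2.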

Collecting only generation 0 is a partial collection, as opposed to the full collection described earlier; the distinction exists because of generations. A generation 0 collection also starts from the roots and looks for referenced objects, but the following steps differ. When the garbage collector finds a root that points to an address in generation 1 or 2, it ignores that root and moves on to the next one. When it finds a root that points to a generation 0 object, it adds that object to the graph. This way only the garbage in generation 0 is processed. The approach has a precondition: the application must not have written to generation 1 or 2 memory in a way that makes an object there refer to generation 0. In practice applications do write to generation 1 and 2 memory. To handle this, the CLR keeps a dedicated data structure, the card table, that records whether the application has written to generation 1 or 2 memory. If such writes happened before the generation 0 collection, the generation 1 and 2 objects recorded in the card table are also treated as roots during that collection. Only then can generation 0 be collected correctly.
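The role of the card table can be illustrated with a toy model. Everything here is an assumption made for illustration: `Node`, `ToyCardTable`, `WriteReference` and `ExtraRootsForGen0` are hypothetical names, each object has a single outgoing reference, and the table remembers whole objects, whereas the description above is about recording writes to generation 1 and 2 memory.

```csharp
using System;
using System.Collections.Generic;

public sealed class Node
{
    public int Generation;   // 0, 1 or 2 in this toy model
    public Node Child;       // a single outgoing reference, may be null
}

public sealed class ToyCardTable
{
    private readonly HashSet<Node> written = new HashSet<Node>();

    // Every reference store goes through here, so writes to older objects get recorded.
    public void WriteReference(Node owner, Node target)
    {
        owner.Child = target;
        if (owner.Generation > 0)      // an object in generation 1 or 2 was written to
            written.Add(owner);
    }

    // Extra roots for a generation 0 collection: recorded older objects
    // that currently point at a generation 0 object.
    public List<Node> ExtraRootsForGen0()
    {
        var roots = new List<Node>();
        foreach (var n in written)
            if (n.Child != null && n.Child.Generation == 0)
                roots.Add(n);
        return roots;
    }
}
```

If a generation 2 object is made to point at a generation 0 object through WriteReference, the older object shows up in ExtraRootsForGen0, so a partial collection that skips generation 1 and 2 roots would still find the young object reachable.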

The paragraphs above mentioned one trigger for a generation 0 collection: generation 0 does not have enough memory to hold a new object. Calling GC.Collect(0) also triggers a generation 0 collection (the parameterless GC.Collect() collects all generations, which includes generation 0). In addition, the garbage collector maintains a threshold for each generation; when the memory used by generation 0 reaches its threshold, a generation 0 collection is triggered as well. A generation 1 collection happens when GC.Collect(1) is called or when generation 1 reaches its threshold, and generation 2 has similar triggers. Collecting generation 1 also collects generation 0, and collecting generation 2 also collects generations 1 and 0. Objects that survive a collection of generation n are moved to generation n+1; if n = 2, the survivors remain in generation 2.
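The rule that collecting generation 1 also collects generation 0 can be checked with GC.CollectionCount, which reports how many times a given generation has been collected. This is a minimal sketch; `CollectionCountDemo` and `CollectGen1` are names invented for the example.

```csharp
using System;

public static class CollectionCountDemo
{
    // Forces a generation 1 collection and returns how much the
    // generation 0 and generation 1 collection counters increased.
    public static int[] CollectGen1()
    {
        int before0 = GC.CollectionCount(0);
        int before1 = GC.CollectionCount(1);
        GC.Collect(1);                         // collects generations 0 and 1
        return new[]
        {
            GC.CollectionCount(0) - before0,
            GC.CollectionCount(1) - before1
        };
    }
}
```

Both counters go up by at least one, because a generation 1 collection is counted for generation 0 as well. Other collections may happen in between, so the increase is a lower bound rather than an exact number.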

Finalization of objects

Finalization is a safety net for the case where a programmer forgets to call Close, Dispose, or a similar method to clean up the resources an object holds. Take a class like the following: when an instance is created, its memory is allocated in generation 0 and a reference to the object is added to the finalization queue maintained by the CLR.

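 using System;

 class BaseObj {
     public BaseObj() { }

     // C# finalizer syntax; the compiler turns this into an override of Object.Finalize.
     ~BaseObj() {
         Console.WriteLine("In Finalize.");
     }
 }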

When the object becomes garbage, the garbage collector moves its reference from the finalization queue to the to-be-finalized queue (Jeffrey Richter calls it the freachable queue), and the object is added back to the reference graph. A dedicated CLR thread takes objects from the to-be-finalized queue one by one and runs their Finalize methods to clean up resources. The object is therefore not reclaimed right away. Only after its Finalize method has run is its reference removed from the to-be-finalized queue, and the garbage collector reclaims its memory in a later collection.

The GC class has two public static methods you may want to know about: GC.ReRegisterForFinalize and GC.SuppressFinalize. ReRegisterForFinalize adds a reference to the object to the finalization queue (meaning the object needs to be finalized). SuppressFinalize removes the object's reference from the finalization queue, so the CLR will no longer run its Finalize method.

Because an object with a Finalize method is added to the finalization queue automatically when it is created with new, there are few situations that call for ReRegisterForFinalize. The typical one is resurrection. Resurrection means that the Finalize method makes a root point to the object again, so the object becomes reachable and will not be collected. At that point, however, the object's reference is no longer in the finalization queue, so ReRegisterForFinalize is needed to add it back. Since resurrection itself is rare in real applications, ReRegisterForFinalize is rarely used as well.

By comparison, SuppressFinalize is used more often. It is needed when a class implements both a Finalize method and a Dispose() method to release resources. Call GC.SuppressFinalize(this) in the body of Dispose(), and the CLR will not run the Finalize method. Finalization is the safety net for when a programmer forgets to call Close or Dispose: if the programmer does call Dispose(), Finalize will not run and release the resources a second time; if the programmer forgets, Finalize is the last line of defense for releasing them. Together the two form a double safeguard.
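A minimal sketch of the Dispose plus finalizer combination described above, using the common Dispose(bool) shape. `ResourceHolder` and `IsDisposed` are hypothetical names for this example, and the actual resource release is left as a comment because the article does not name a specific resource.

```csharp
using System;

public class ResourceHolder : IDisposable
{
    private bool disposed;

    public bool IsDisposed { get { return disposed; } }

    // Called explicitly by the programmer (or implicitly by a using statement).
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);   // cleanup is done, so the finalizer is no longer needed
    }

    // Safety net: runs only if Dispose() was never called.
    ~ResourceHolder()
    {
        Dispose(false);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposed) return;        // repeated calls are harmless
        if (disposing)
        {
            // release other managed IDisposable members here (none in this sketch)
        }
        // release the unmanaged resource here (hypothetical)
        disposed = true;
    }
}
```

A caller would normally write `using (var holder = new ResourceHolder()) { ... }`, which calls Dispose() at the end of the block, so the finalizer path is only taken when the caller forgets.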

Finalization has a significant impact on the garbage collector's performance, and the CLR cannot guarantee when or in what order Finalize methods are called. So avoid relying on finalization as far as you can; clean up resources with a method of your own, or with a method named Close or Dispose. Consider implementing the IDisposable interface and putting the cleanup logic in the body of the Dispose method.

Large object heap

The large object heap is dedicated to objects of 85,000 bytes or more. Its initial memory block usually sits above the memory of generation 0 and is not adjacent to it. Generations 0, 1 and 2 together are called the small object heap. When the CLR allocates a new object, it allocates from generation 0 if the object is smaller than 85,000 bytes and from the large object heap if it is 85,000 bytes or larger.
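The size threshold can be observed indirectly: GC.GetGeneration reports generation 0 for a freshly allocated small array, while a freshly allocated large array is reported as generation 2, because the large object heap is collected together with generation 2. The sketch below assumes the default threshold of 85,000 bytes; `LohDemo` and `GenerationOfNewArray` are names invented for the example.

```csharp
using System;

public static class LohDemo
{
    // Allocates a byte array of the given length and returns the
    // generation the runtime reports for it immediately afterwards.
    public static int GenerationOfNewArray(int length)
    {
        var buffer = new byte[length];
        int generation = GC.GetGeneration(buffer);
        GC.KeepAlive(buffer);          // keep the array reachable until after the query
        return generation;
    }
}
```

Calling `LohDemo.GenerationOfNewArray(1000)` and `LohDemo.GenerationOfNewArray(100000)` side by side shows the difference between a small object heap allocation and a large object heap allocation.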

Because large objects are big, collecting them is expensive, so they are collected only during a generation 2 collection. Collection of large objects also starts from the roots and looks for reachable objects; large objects that are not reachable can be reclaimed. After reclaiming them, the garbage collector does not compact the large object heap, since moving large objects is costly. Instead it uses a data structure called the free object table to record which memory ranges can be reused; two adjacent reclaimed large objects are treated as a single entry in the table. The space recorded in the free object table can then be allocated to new large objects.

Allocating and collecting large objects costs more than allocating and collecting small ones, so in practice it is best to avoid allocating a large object and discarding it shortly afterwards. Consider allocating a pool of large objects and reusing it instead of having large objects reclaimed frequently.
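One way to follow the pooling advice is a small buffer pool. This is a minimal, single-threaded sketch under stated assumptions: `LargeBufferPool`, `Rent` and `Return` are hypothetical names for this example, all buffers share one fixed size, and no locking or upper bound on the pool is included.

```csharp
using System;
using System.Collections.Generic;

public sealed class LargeBufferPool
{
    private readonly Stack<byte[]> free = new Stack<byte[]>();
    private readonly int bufferSize;

    public LargeBufferPool(int bufferSize)
    {
        this.bufferSize = bufferSize;
    }

    // Hands out a previously returned buffer if one is available,
    // otherwise allocates a new one.
    public byte[] Rent()
    {
        return free.Count > 0 ? free.Pop() : new byte[bufferSize];
    }

    // Gives a buffer back to the pool so a later Rent() can reuse it.
    public void Return(byte[] buffer)
    {
        if (buffer == null || buffer.Length != bufferSize)
            throw new ArgumentException("Buffer does not belong to this pool.");
        free.Push(buffer);
    }
}
```

A caller rents a buffer, uses it, and returns it; the next Rent() gets the same array back instead of triggering a new large object heap allocation. Returned buffers still contain old data, so callers that need a clean buffer have to clear it themselves.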

 

To be continued…

References

Garbage Collection: Automatic Memory Management in the Microsoft .NET Framework By Jeffrey Richter  http://msdn.microsoft.com/en-us/magazine/bb985010.aspx

Garbage Collection Part 2: Automatic Memory Management in the Microsoft .NET Framework By Jeffrey Richter http://msdn.microsoft.com/en-us/magazine/bb985011.aspx

Garbage Collector Basics and Performance Hints By Rico Mariani at Microsoft  http://msdn.microsoft.com/en-us/library/ms973837.aspx

http://drowningintechnicaldebt.com/blogs/royashbrook/archive/2007/06/22/top-20-net-garbage-collection-gc-articles.aspx

Large Object Heap Uncovered By Maoni Stephens http://msdn.microsoft.com/en-us/magazine/cc534993.aspx

Garbage collection in msdn http://msdn.microsoft.com/en-us/library/0xy59wtx.aspx

The changes CLR 4.0 brings are not covered in this post; please read the next post.

Time: 2024-09-17 04:29:40
