前二天早上tomcat突然掛掉,log 顯示如下:
20-Apr-2022 10:22:41.714 嚴重 [https-jsse-nio2-8443-Acceptor-0] org.apache.tomcat.util.net.Acceptor.run Socket accept failed java.util.concurrent.ExecutionException: java.io.IOException: 開啟太多檔案 at sun.nio.ch.CompletedFuture.get(CompletedFuture.java:69) at org.apache.tomcat.util.net.Nio2Endpoint.serverSocketAccept(Nio2Endpoint.java:334) at org.apache.tomcat.util.net.Nio2Endpoint.serverSocketAccept(Nio2Endpoint.java:59) at org.apache.tomcat.util.net.Acceptor.run(Acceptor.java:95) at java.lang.Thread.run(Thread.java:748) Caused by: java.io.IOException: 開啟太多檔案 at sun.nio.ch.UnixAsynchronousServerSocketChannelImpl.accept0(Native Method) at sun.nio.ch.UnixAsynchronousServerSocketChannelImpl.accept(UnixAsynchronousServerSocketChannelImpl.java:344) at sun.nio.ch.UnixAsynchronousServerSocketChannelImpl.implAccept(UnixAsynchronousServerSocketChannelImpl.java:280) at sun.nio.ch.AsynchronousServerSocketChannelImpl.accept(AsynchronousServerSocketChannelImpl.java:125) ... 4 more
初步調查是tomcat開啟了太多檔案,造成"too many open files" (開啟太多檔案),網路上一般說法是說要調整系統的ulimit上限才行,但想想不太對,這系統會需要同時開啟的檔案不多,而且運行了幾年也沒有發生file leak的狀況。
先找出tomcat的pid,再用 ls -al /proc/<pid>/fd | wc -l,就可以知道tomcat目前開啟多少檔案(包含unix socket file)。
後來發現是tomcat自已的bug,會一直存取tomcat-users.xml卻沒有釋放資源,解法是升級 tomcat 9 >= 9.0.14+,或修改 server.xml,加入watchSource="false":
<resource auth="Container" description="User database that can be updated and saved" factory="org.apache.catalina.users.MemoryUserDatabaseFactory" name="UserDatabase" pathname="conf/tomcat-users.xml" type="org.apache.catalina.UserDatabase" watchsource="false">ref:
https://bz.apache.org/bugzilla/show_bug.cgi?id=62924#c2
https://zhuanlan.zhihu.com/p/75897823
https://www.cnblogs.com/linus-tan/p/10328801.html