> For the complete documentation index, see [llms.txt](https://huataihuang.gitbook.io/cloud-atlas-draft/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://huataihuang.gitbook.io/cloud-atlas-draft/os/linux/security/audit/find_inode_consume.md).

# 找出inode消耗原因

在生产环境上，如果出现inode大量消耗的情况，是需要通过快速定位[找出使用inode最多的文件目录](https://github.com/huataihuang/cloud-atlas-draft/tree/d20c619da83648a4cc9d8a2fc1ecb3c2188f538f/os/linux/security/audit/os/linux/storage/filesystem/ext/find_where_inodes_are_being_used.md)。但是，我们还需要进一步找出大量生成文件的原因。

我们已经定位了大量文件存在于 `tmp/exec` 目录，现在我们排查文件写入的进程来源。

虽然文件大量快速生成，但是进程写文件都是瞬间完成，无法简单通过iotop找出源头。所以，我们同样需要使用 [audit](https://github.com/huataihuang/cloud-atlas-draft/tree/d20c619da83648a4cc9d8a2fc1ecb3c2188f538f/os/linux/security/audit/audit_architecture/README.md) 来审计记录文件生成来源。

* 启用audit

```
systemctl start auditd
```

* 然后记录谁写了这个目录

```
auditctl -w /tmp/exec -S execve
```

提示错误

```
WARNING - 32/64 bit syscall mismatch, you should specify an arch
Error sending add rule data request (Rule exists)
```

改成

```
auditctl -w /tmp/exec -F arch=b64 -S execve
```

提示

```
arch must be before -S
```

* 添加规则

```
auditctl -a create -F arch=b64 -S execve -k MYEXEC
auditctl -w /tmp/exec -p war -k whatsgoingon
```

* 显示当前rule

```
auditctl -l
```

清理rule

```
auditctl -D
```

* 记录下

```
type=CWD msg=audit(1596944388.142:53167): cwd="/"
type=PATH msg=audit(1596944388.142:53167): item=0 name="/tmp/exec/" inode=920446 dev=fd:01 mode=040755 ouid=0 ogid=0 rdev=00:00 nametype=PARENT
type=PATH msg=audit(1596944388.142:53167): item=1 name="/tmp/exec/0bcdb0aa41ab2fbd1ac798559f8ff0258c470f55c206a6de953ff2ca31e89222.done" inode=589773 dev=fd:01 mode=0100755 ouid=0 ogid=0 rdev=00:00 nametype=CREATE
type=PROCTITLE msg=audit(1596944388.142:53167): proctitle=646F636B65722D686F73742D6578656300306263646230616134316162326662643161633739383535396638666630323538633437306635356332303661366465393533666632636133316538393232320066616C7365
type=USER_END msg=audit(1596944388.143:53168): pid=37943 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:session_close grantors=pam_keyinit,pam_limits acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=? res=success'
type=CRED_DISP msg=audit(1596944388.143:53169): pid=37943 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:setcred grantors=pam_env,pam_unix acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=? res=success'
type=SYSCALL msg=audit(1596944388.144:53170): arch=c000003e syscall=257 success=yes exit=3 a0=ffffffffffffff9c a1=c0005c62d0 a2=80241 a3=1ff items=2 ppid=54667 pid=37768 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="exe" exe="/usr/local/bin/pouchd" key="whatsgoingon"

type=CWD msg=audit(1596944388.150:53171): cwd="/"
type=PATH msg=audit(1596944388.150:53171): item=0 name="/tmp/exec/" inode=920446 dev=fd:01 mode=040755 ouid=0 ogid=0 rdev=00:00 nametype=PARENT
type=PATH msg=audit(1596944388.150:53171): item=1 name="/tmp/exec/7cdd887ff4f5f2013521782b7f3fe49d8de6a1fb12a20198b0d5d5725c4b6821.stdout" inode=589775 dev=fd:01 mode=0100644 ouid=0 ogid=0 rdev=00:00 nametype=CREATE
type=PROCTITLE msg=audit(1596944388.150:53171): proctitle=646F636B65722D686F73742D6578656300376364643838376666346635663230313335323137383262376633666534396438646536613166623132613230313938623064356435373235633462363832310066616C7365
type=SYSCALL msg=audit(1596944388.150:53172): arch=c000003e syscall=257 success=yes exit=7 a0=ffffffffffffff9c a1=c0004e02a0 a2=80242 a3=1b6 items=2 ppid=54667 pid=37770 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="exe" exe="/usr/local/bin/pouchd" key="whatsgoingon"
```

这里有一个问题，可以看到不断有新的线程创建文件，例如上面的 `ppid=54667 pid=37768`

* 抓到了：

```
#ps aux | grep 54667
root      34505  0.0  0.0 112708  2312 pts/0    S+   11:44   0:00 grep --color=auto 54667
root      54667 57.4  0.0 9538840 164552 ?      Ssl  Aug04 3852:09 /usr/local/bin/pouchd
```

原来进程是 pouchd

* audit对系统有影响，所以排查完问题之后我们可能需要停止audit

奇怪，无法停止auditd

```
#systemctl stop auditd
Failed to stop auditd.service: Operation refused, unit auditd.service may be requested by dependency only (it is configured to refuse manual start/stop).
See system logs and 'systemctl status auditd.service' for details.
```

这个问题在 [Prevent stop auditd service in Redhat 7](https://stackoverflow.com/questions/38520295/prevent-stop-auditd-service-in-redhat-7) 有解释

```bash
#systemctl cat auditd
# /usr/lib/systemd/system/auditd.service
[Unit]
Description=Security Auditing Service
DefaultDependencies=no
After=local-fs.target systemd-tmpfiles-setup.service
Conflicts=shutdown.target
Before=sysinit.target shutdown.target
RefuseManualStop=yes
ConditionKernelCommandLine=!audit=0

[Service]
ExecStart=/sbin/auditd -n
## To not use augenrules, copy this file to /etc/systemd/system/auditd.service
## and comment/delete the next line and uncomment the auditctl line.
## NOTE: augenrules expect any rules to be added to /etc/audit/rules.d/
ExecStartPost=-/sbin/augenrules --load
#ExecStartPost=-/sbin/auditctl -R /etc/audit/audit.rules
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target
```

原来 systemd 配置中禁止手工停止auditd `RefuseManualStop=yes`

所以改为传统命令停止方式

```
service auditd stop
```

## 清理文件

既然我们已经找出了大量生成文件的原因，除了我哦们需要进一步排查pouch大量生成异常文件的原因，我们还需要做文件清理。但是文件系统中大量文件清理是非常消耗资源和缓慢的，我们需要有一个 [高效清理文件系统大量文件的方法](https://github.com/huataihuang/cloud-atlas-draft/tree/d20c619da83648a4cc9d8a2fc1ecb3c2188f538f/os/linux/security/audit/remove_massive_amounts_of_file/README.md)


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://huataihuang.gitbook.io/cloud-atlas-draft/os/linux/security/audit/find_inode_consume.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
