After iOS 18, some new categories of crash exceptions appeared online, such as those related to the sqlite pcache1 module, those related to the photo album PHAsset, those related to various objc_release crashes, etc. These crash scenarios and stacks are all different, but they all share a common feature, that is, they all crash due to accessing NULL or NULL addresses with a certain offset. According to the analysis, the direct cause is that a certain pointer, which previously pointed to valid memory content, has now become pointing to 0 incorrectly and mysteriously.
We tried various methods to eliminate issues such as multi-threading problems. To determine the cause of the problem, we have a simulated malloc guard detection in production. The principle is very simple:
- Create some private NSString objects with random lengths, but ensure that they exceed the size of one memory physical page.
- Set the first page of memory for these objects to read-only (aligning the object address with the memory page).
- After a random period of time (3s - 10s), reset the memory of these objects to read/write and immediately release these objects. Then repeat the operation starting from step 1.
In this way, if an abnormal write operation is performed on the memory of these objects, it will trigger a read-only exception crash and report the exception stack.
Surprisingly, after the malloc guard detection was implemented, some crashes occurred online. However, the crashes were not caused by any abnormal rewriting of read-only memory. Instead, they occurred when the NSString objects were released as mentioned earlier, and the pointers pointed to contents of 0.
Therefore, we have added object memory content printing after object generation, before and after setting to read-only, and before and after reverting to read-write. The result was once again unexpected. The log showed that the isa pointer of the object became 0 after setting to read-only and before re-setting to read-write.
So why did it become 0 during read-only mode, but no crash occurred due to the read-only status?
We have revised the plan again. We have added a test group, in which after the object is created, we will mlock the memory of the object, and then munlock it again before the object is released. As a result, the test analysis showed that the test group did not experience a crash, while the crashes occurred entirely in the control group.
In this way, we can prove that the problem occurs at the system level and is related to the virtual memory function of the operating system. It is possible that inactive memory pages are compressed and then cleared to zero, and subsequent decompression fails. This results in the accidental zeroing out of the memory data.
As mentioned at the beginning, althougth this issue is a very rare occurrence, but it exists in various scenarios. definitely It appeared after iOS 18. We hope that the authorities will pay attention to this issue and fix it in future versions.