2016-07-22

WindowServer: The privilege chameleon on macOS (Part 1)

When talking about Apple graphics, the WindowServer component should not be neglected. Recently KeenLab has talked about the Apple graphics IOKit components at POC 2015 (“OS X Kernel is As Strong as its Weakest Part”), CanSecWest 2016 (“Don’t Trust Your Eye: Apple Graphics Is Compromised!”), and RECon 2016 (“Shooting the OS X El Capitan Kernel Like a Sniper”). The userland part, however, is seldom mentioned in public.

This week the Pwnie Awards announced the bug nominations for 2016, and the WindowServer bug CVE-2016-1804 is on the list, which made me think of writing something. But when I started writing, I realized it is a long story. Then I realized a long story can be cut into short stories (I also realized my IQ has been low recently, which many of my colleagues have pointed out, maybe due to the extremely hot weather in Shanghai, or not…)

So… I decided to split the whole story into three parts. In part 1, I will mainly focus on the history of WindowServer, basic concepts, architecture, CVE-2014-1314 (a design flaw which we used to take down OS X Mavericks at Pwn2Own 2014) and finally the details of the Pwnie-nominated bug, CVE-2016-1804, which we used to take down the latest OS X El Capitan remotely with a browser exploit and escalate to root privilege. However, when I first discovered CVE-2016-1804 last year, it was considered unexploitable, at least for one week. Part 1 wraps up there with questions and challenges.

Next week I will release part 2, which covers the partial exploitation by introducing a 0day that inspired the successful exploitation of CVE-2016-1804. The last part, part 3, which is the most exciting one, is NOT a blog post; instead it will be discussed at Black Hat 2016 Briefings: “SUBVERTING APPLE GRAPHICS: PRACTICAL APPROACHES TO REMOTELY GAINING ROOT”.

Ok, now let’s start the short story:

0x1 Introduction

Apple Graphics is one of the most complex components in the Apple world (OS X and iOS). It mainly consists of the following two parts:
– Userland part
– Kernel IOKit drivers
OS X and iOS have a similar graphics architecture. Userland graphics on OS X is mainly handled by the “WindowServer” process, while on iOS it is handled by the “SpringBoard/backboardd” processes. The userland graphics combined with the kernel graphics drivers can be considered the counterpart of “win32k.sys” on Windows, although the two architectures differ a little. The userland part of Apple graphics is handled in a separate process, while Windows provides a set of GDI32 APIs which call into the kernel “win32k.sys” directly. Apple’s approach is more secure from an architectural perspective because userland virtual memory is not shared between processes, which increases the exploitation difficulty, especially when SMEP/SMAP is not enforced.

0x2 WindowServer Overview

The WindowServer process mainly contains two private frameworks, CoreGraphics and QuartzCore, each running on a separate thread. Each framework contains two sets of APIs:
– Client side API: Functions starting with “CGS” (CoreGraphics) or “CAS” (QuartzCore)
e.g.

void __fastcall _CGSGetWindowShape(mach_port_t a1, int a2, _QWORD *a3, _DWORD *a4)
{
...
}

– Server side API: Functions starting with “__X” (e.g. __XCreateSession)
e.g.

__int64 __fastcall _XGetWindowShape(_DWORD *a1, __int64 a2)
{
...
}

The client-side APIs can be called from any client process. They are implemented by obtaining the target mach port, composing a mach message, and sending it by calling the mach_msg API with a specific message ID and send/receive sizes. The server-side APIs are called by dedicated WindowServer threads: both the CoreGraphics and the QuartzCore thread run a server loop waiting for new client messages. Once a client message arrives, the dispatcher code picks it up and calls the corresponding server API based on the message ID.
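To make the dispatch step concrete, here is a minimal, portable sketch of a message-ID dispatcher. It is not WindowServer's real code: the demo_ types, the two handlers and the ID base 29000 are all made up for illustration. Only the error value -303 is real (it is the value of MIG_BAD_ID).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MSG_ID_BASE 29000   /* assumed base, not a real WindowServer ID */
#define DEMO_BAD_ID      (-303)  /* same value as MIG_BAD_ID */

/* Hypothetical request/reply layouts: only the ID matters for dispatch. */
typedef struct { int32_t id; int32_t payload; } demo_request_t;
typedef struct { int32_t ret_code; int32_t value; } demo_reply_t;
typedef void (*demo_handler_t)(const demo_request_t *req, demo_reply_t *rep);

/* Two toy server-side handlers standing in for the "_X..." routines. */
static void demo_x_get_window_shape(const demo_request_t *req, demo_reply_t *rep)
{
    rep->ret_code = 0;
    rep->value = req->payload * 2;
}

static void demo_x_create_session(const demo_request_t *req, demo_reply_t *rep)
{
    rep->ret_code = 0;
    rep->value = req->payload + 1;
}

/* Handler table indexed by (message ID - base). */
static const demo_handler_t demo_handlers[] = {
    demo_x_get_window_shape,
    demo_x_create_session,
};

/* One iteration of the server loop: pick the handler from the message ID. */
void demo_dispatch(const demo_request_t *req, demo_reply_t *rep)
{
    assert(req != NULL && rep != NULL);
    const size_t count = sizeof demo_handlers / sizeof demo_handlers[0];
    const int64_t index = (int64_t)req->id - DEMO_MSG_ID_BASE;

    rep->ret_code = DEMO_BAD_ID;  /* default reply for unknown IDs */
    rep->value = 0;
    if (index >= 0 && (uint64_t)index < count)
        demo_handlers[index](req, rep);
}

/* Client-side stub: compose the request, "send" it, return the reply. */
demo_reply_t demo_client_call(int32_t id, int32_t payload)
{
    demo_request_t req = { id, payload };
    demo_reply_t rep;
    demo_dispatch(&req, &rep);
    return rep;
}
```

In the real system the client stub and the dispatcher live in different processes and the request travels through mach_msg; here they are called directly so that the ID-to-handler mapping is easy to see.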
Here is a snapshot of WindowServer process:

0x3 Sandbox consideration

Almost every process (including sandboxed applications) can call interfaces in the WindowServer process through MIG (Mach Interface Generator) IPC. Browser applications, including Safari, can directly reach WindowServer interfaces from a restrictive sandboxed context. Vulnerabilities in the WindowServer process may therefore lead to a sandbox escape from a remote, browser-based drive-by attack. They may also lead to root privilege escalation, as the WindowServer process behaves like a privilege chameleon. The Safari WebContent process has its own sandbox profile, defined in /System/Library/Frameworks/WebKit.framework/Versions/A/Resources/com.apple.WebProcess.sb, and access to the WindowServer service API is allowed by the following rule:

(allow mach-lookup
      (global-name "com.apple.windowserver.active")
)

The QuartzCore interface does not seem to be explicitly defined there, so we focus on the CoreGraphics interfaces first.

Three years ago, when we decided to explore sandbox escape vulnerabilities on OS X, we picked attack surfaces which meet the following criteria:

– Interfaces which can be reached from the browser (because at that time my IQ was not that low)
– Components which run under weak sandbox profiles or with no sandbox at all
– Components which have been around for a long time, especially those born before the Apple sandbox was introduced in OS X Leopard. This is a typical hacker’s line of thought, which I learned from the ASLR story: when ASLR was first introduced in Windows Vista, a lot of previously useless information leak vulnerabilities became vital for breaking ASLR. Most of them can be exploited very reliably because they have nothing to do with memory corruption; they are just logic flaws which were never considered flaws before ASLR was born.
– etc.

After that, WindowServer became one of the key focuses of our vulnerability discovery work.

0x4 MIG IPC

MIG IPC can be described by the following graph from the Google Project Zero blog:

Source: http://googleprojectzero.blogspot.kr/2014/11/pwn4fun-spring-2014-safari-part-ii.html

Like the IPC on other modern operating systems, MIG IPC can pass information between processes. In the following scenarios, the kernel is involved in the IPC:

– When process A wants to pass a pointer to process B
– When process A wants to pass a mach port to process B (on Windows, the similar concept is a HANDLE)
In these scenarios, it is not enough to pass the raw value between processes; instead the kernel needs to map the address into the target process, or allocate for the target process a mach port which represents the kernel object. In the Apple world, the message in the first scenario is called an Out Of Line (OOL) descriptor message and the one in the second is called a port descriptor message.
Let’s look at the mach_msg API as described in the documentation hosted by MIT:
```
mach_msg_return_t   mach_msg
                  (mach_msg_header_t               *msg,
                   mach_msg_option_t             option,
                   mach_msg_size_t            send_size,
                   mach_msg_size_t        receive_limit,
                   mach_port_t             receive_name,
                   mach_msg_timeout_t           timeout,
                   mach_port_t                   notify);
```

PARAMETERS
msg
[pointer to in/out structure containing random and reply rights]
A message buffer used by mach_msg both for send and receive. This must be naturally aligned.

The msg parameter is the interesting one. The buffer it points to starts with a mach_msg_header_t structure:

typedef	struct 
{
  mach_msg_bits_t	msgh_bits;
  mach_msg_size_t	msgh_size;
  mach_port_t		msgh_remote_port;
  mach_port_t		msgh_local_port;
  mach_port_name_t	msgh_voucher_port;
  mach_msg_id_t		msgh_id;
} mach_msg_header_t;

Here the highest bit of the 32-bit msgh_bits defines whether it is a simple message (0x0) or a complex one (0x1). A simple message means no pointer or port is passed to the target process; in this case the actual message data is just appended right after the mach_msg_header_t.
A complex message carries one or more descriptors, which fall into three categories:

typedef union
{
  mach_msg_port_descriptor_t		port;
  mach_msg_ool_descriptor_t		out_of_line;
  mach_msg_ool_ports_descriptor_t	ool_ports;
  mach_msg_type_descriptor_t		type;
} mach_msg_descriptor_t;

The three types are:

– mach_msg_port_descriptor_t: a port descriptor message
– mach_msg_ool_descriptor_t: an OOL descriptor message
– mach_msg_ool_ports_descriptor_t: an OOL ports descriptor (a pointer to a list of mach ports)
The type constants are defined as:

#define MACH_MSG_PORT_DESCRIPTOR 		0
#define MACH_MSG_OOL_DESCRIPTOR  		1
#define MACH_MSG_OOL_PORTS_DESCRIPTOR 		2

So when you want to send a complex message, set the highest bit of the 32-bit msgh_bits in mach_msg_header_t to 1, and follow the header with msgh_descriptor_count, which indicates the number of descriptors in the message:

typedef struct
{
        mach_msg_size_t msgh_descriptor_count;
} mach_msg_body_t;

Then append a list of mach_msg_port_descriptor_t, mach_msg_ool_descriptor_t, or mach_msg_ool_ports_descriptor_t entries to finish composing the message, and send it to the target process.
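The layout described above (header, then descriptor count, then descriptors) can be sketched in portable C. This is only a model: the demo_ structures are simplified stand-ins, not the real definitions from the Mach headers, and the descriptor here has just two fields. The complex-bit value 0x80000000 and the port descriptor type 0 match the real constants.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MSGH_BITS_COMPLEX 0x80000000u /* same value as MACH_MSGH_BITS_COMPLEX */
#define DEMO_PORT_DESCRIPTOR   0u          /* same value as MACH_MSG_PORT_DESCRIPTOR */
#define DEMO_MAX_PORTS         4u          /* arbitrary limit for this sketch */

/* Simplified stand-in for mach_msg_header_t (six 32-bit fields). */
typedef struct {
    uint32_t msgh_bits;
    uint32_t msgh_size;
    uint32_t msgh_remote_port;
    uint32_t msgh_local_port;
    uint32_t msgh_voucher_port;
    int32_t  msgh_id;
} demo_header_t;

/* Stand-in for mach_msg_body_t. */
typedef struct { uint32_t msgh_descriptor_count; } demo_body_t;

/* Heavily simplified port descriptor: the real one has more fields. */
typedef struct { uint32_t name; uint32_t type; } demo_port_descriptor_t;

typedef struct {
    demo_header_t          header;
    demo_body_t            body;
    demo_port_descriptor_t ports[DEMO_MAX_PORTS];
} demo_complex_msg_t;

/* Nonzero if the highest bit of msgh_bits marks the message as complex. */
int demo_is_complex(const demo_header_t *header)
{
    assert(header != NULL);
    return (header->msgh_bits & DEMO_MSGH_BITS_COMPLEX) != 0;
}

/* Compose a complex message carrying `count` port descriptors.
 * Returns the number of bytes that would be sent, or 0 on bad input. */
uint32_t demo_compose(demo_complex_msg_t *msg, int32_t id,
                      const uint32_t *names, uint32_t count)
{
    assert(msg != NULL && names != NULL);
    if (count == 0 || count > DEMO_MAX_PORTS)
        return 0;

    memset(msg, 0, sizeof *msg);
    msg->header.msgh_bits = DEMO_MSGH_BITS_COMPLEX;  /* step 1: complex bit */
    msg->header.msgh_id = id;
    msg->body.msgh_descriptor_count = count;         /* step 2: descriptor count */
    for (uint32_t i = 0; i < count; i++) {           /* step 3: the descriptors */
        msg->ports[i].name = names[i];
        msg->ports[i].type = DEMO_PORT_DESCRIPTOR;
    }
    msg->header.msgh_size = (uint32_t)(sizeof msg->header + sizeof msg->body
                                       + count * sizeof msg->ports[0]);
    return msg->header.msgh_size;
}
```

On a real system the kernel walks exactly this descriptor list when the complex bit is set, translating each port or OOL region for the receiver, which is why the count and the descriptors must agree.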

It may seem useless to put these basic concepts here, but believe me, they will be useful (maybe in part 2).

0x5 CoreGraphics Interface

The CoreGraphics interfaces are divided into the following categories:
– Workspace
– Window
– Transitions
– Session
– Region
– Surface
– Notifications
– HotKeys
– Display
– Cursor
– Connection
– CIFilter
– Event Tap
– Misc
When the sandbox was introduced in Leopard, my first idea for bypassing it was to perform a series of simulated mouse/keyboard operations (for example, moving the cursor to the Calculator icon and double-clicking it). The first attempt got me excited, because it was quite easy to move the cursor anywhere on the screen from a sandboxed environment by calling _XWarpCursorPosition:

__int64 __fastcall _XWarpCursorPosition(_DWORD *_RDI, __int64 a2)
{
  __int64 *v6; // rdi@3
  signed int v7; // er14@3
  __int64 v13; // r15@7
  __int64 v14; // rax@7
  __int64 result; // rax@14
  int v21; // [rsp+0h] [rbp-20h]@3

  if ( *_RDI < 0 || _RDI[1] != 44 )
  {
    *(_DWORD *)(a2 + 32) = -304;
  }
  ...
    v7 = CGXWarpCursorPosition(0LL);
  ...
  return result;
}

And I quickly located the function for posting a double-click event, __XPostFilteredEventTapDataSync:

__int64 __fastcall _XPostFilteredEventTapDataSync(_DWORD *a1, __int64 a2)
{
  ...
  else
  {
    *(_DWORD *)(a2 + 32) = post_filtered_event_tap_data(a1[8], a1[9], (unsigned int)a1[10], a1[11], a1 + 13, v3);
  }
  result = *(_QWORD *)NDR_record_ptr;
  *(_QWORD *)(a2 + 24) = *(_QWORD *)NDR_record_ptr;
  return result;
}

Unfortunately, post_filtered_event_tap_data checks the sandbox:

__int64 __fastcall post_filtered_event_tap_data(unsigned int a1, unsigned int a2, __int64 a3, unsigned int a4, _DWORD *a5, unsigned int a6)
{
...
  if ( CGXSenderCanSynthesizeEvents() ) //check here
  {
  ...
}

bool CGXSenderCanSynthesizeEvents()
{
  unsigned int v0; // ecx@1
  bool result; // al@2

  v0 = WSGetLastMessageAuditTrailerPid();
  if ( v0 )
    result = (unsigned int)sandbox_check(v0, "hid-control", 0LL) == 0; // failed to pass the check
  else
    result = 0;
  return result;
}

Because most sandboxed applications don’t have the “hid-control” entitlement, my initial attempt had to stop here.

Another idea was to add a custom hotkey by calling _XSetHotKey, but that also ended in failure:

__int64 __fastcall _XSetHotKey(__int64 a1, __int64 a2)
{
  ...
  if ( v20 && (unsigned int)sandbox_check(*(unsigned int *)(v20 + 284), "hid-control", 0LL) ) //sandbox check here
            goto LABEL_39;
        }
      }
...
LABEL_39:
    *(_DWORD *)(a2 + 32) = v7;
    goto LABEL_40;
  }
  *(_DWORD *)(a2 + 32) = -304;
LABEL_40:
  result = *(_QWORD *)NDR_record_ptr;
  *(_QWORD *)(a2 + 24) = *(_QWORD *)NDR_record_ptr;
  return result;
}

In fact, many interfaces in the above API set are regarded as “unsafe”, so a sandbox check is performed in those server-side APIs. Typical examples include event taps and hotkey configuration. Because of that, dangerous operations such as adding a hotkey or posting an event (e.g. sending a mouse click event) are strictly forbidden for a sandboxed application.
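The gating pattern seen in CGXSenderCanSynthesizeEvents can be modelled in a few lines of portable C. Everything prefixed with demo_ is hypothetical: the policy callback merely stands in for the real sandbox query, the sandboxed PID is made up, and the -1 error code is an assumption, since the excerpt above does not show what the real function returns on denial. What is taken from the decompiled code is the shape of the check: a zero PID is rejected, and the query result 0 means "allowed".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_SANDBOXED_PID 4242u  /* made-up PID of a sandboxed client */
#define DEMO_ERR_DENIED    (-1)   /* assumed error code for this sketch */

/* Stand-in for the sandbox query: returns 0 when the operation is allowed. */
typedef int (*demo_policy_fn)(unsigned int pid, const char *operation);

/* Toy policy: the sandboxed client is denied "hid-control", others pass. */
int demo_policy(unsigned int pid, const char *operation)
{
    if (pid == DEMO_SANDBOXED_PID && strcmp(operation, "hid-control") == 0)
        return 1;
    return 0;
}

/* Mirrors the decompiled logic: no sender PID means no, else ask the policy. */
bool demo_sender_can_synthesize_events(unsigned int sender_pid,
                                       demo_policy_fn policy)
{
    assert(policy != NULL);
    if (sender_pid == 0)
        return false;
    return policy(sender_pid, "hid-control") == 0;
}

/* Server-side handler for an "unsafe" interface: check first, then act. */
int demo_post_event(unsigned int sender_pid)
{
    if (!demo_sender_can_synthesize_events(sender_pid, demo_policy))
        return DEMO_ERR_DENIED;
    /* ... the event would be posted here ... */
    return 0;
}
```

The important property is that the check keys on the sender identified by the server, not on anything the client writes into the message body.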

On the other hand, some interfaces are partially allowed. Typical examples include CIFilter and the window-related interfaces. Such interfaces perform operations on specific entities that belong to the caller’s process. For example, the API __XMoveWindow performs a window move operation. It accepts a user-provided window ID and calls the connection_holds_rights_on_window function to determine whether the caller’s process is allowed to move that window. Only the window owner’s process is allowed to do so (otherwise a special entitlement is needed to operate on arbitrary windows):

__int64 __usercall _XMoveWindow@<rax>(__int64 a1@<rax>, _DWORD *a2@<rdi>, __int64 a3@<rsi>, __int128 _XMM0@<xmm0>)
{
 
    if ( (unsigned __int8)connection_holds_rights_on_window(v8, 1LL, v7, 1LL, 1LL) //check window rights of the source process
      || (v9 = 1000, v7)
      && (v10 = (unsigned __int8)connection_holds_rights_on_window(v8, 4LL, v7, 1LL, 1LL) == 0, v9 = 1000, !v10) )
    {
      __asm
      {
        vcvtsi2ss xmm0, xmm0, r12d
        vcvtsi2ss xmm1, xmm0, r13d
      }
      v9 = CGXMoveWindowList(v8, (char *)&v14 + 4, 1LL);
    }
    *(_DWORD *)(a3 + 32) = v9;
  }
  result = *(_QWORD *)NDR_record_ptr;
  *(_QWORD *)(a3 + 24) = *(_QWORD *)NDR_record_ptr;
  return result;
}
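Stripped of the decompiler noise, the ownership check boils down to something like the following sketch. The window table, the connection IDs and the is_privileged flag are all hypothetical; the only value taken from the decompiled code is the error 1000 that is stored in the reply when the rights check fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_ERR_FAILURE 1000  /* value the decompiled code stores on failure */

/* Hypothetical window record: who owns it and where it is. */
typedef struct {
    unsigned int window_id;
    unsigned int owner_cid;  /* connection ID of the owning process */
    int x, y;
} demo_window_t;

static demo_window_t demo_windows[] = {
    { 1, 100, 0, 0 },
    { 2, 200, 0, 0 },
};

static demo_window_t *demo_find_window(unsigned int window_id)
{
    for (size_t i = 0; i < sizeof demo_windows / sizeof demo_windows[0]; i++)
        if (demo_windows[i].window_id == window_id)
            return &demo_windows[i];
    return NULL;
}

/* The owner always holds rights; a privileged caller holds them everywhere. */
bool demo_connection_holds_rights(unsigned int cid, bool is_privileged,
                                  const demo_window_t *window)
{
    return window != NULL && (is_privileged || window->owner_cid == cid);
}

/* Server-side move: look up the user-provided ID, check rights, then move. */
int demo_move_window(unsigned int cid, bool is_privileged,
                     unsigned int window_id, int x, int y)
{
    demo_window_t *window = demo_find_window(window_id);
    if (!demo_connection_holds_rights(cid, is_privileged, window))
        return DEMO_ERR_FAILURE;
    assert(window != NULL);
    window->x = x;
    window->y = y;
    return 0;
}
```

For example, connection 100 can move window 1 but gets 1000 back when it tries to move window 2, unless it is flagged as privileged.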

At this point, I came to believe that Apple had considered everything needed to make the Apple sandbox work with those older components.
But luckily I started thinking about all of this in 2013, which was a pretty good time. I finally found a case that Apple had failed to consider.

0x6 CVE-2014-1314: the old legend

As we know, the Apple sandbox was introduced not that long ago, while Apple graphics has a much longer history. The original design of Apple graphics did not take the sandbox into account. Although years have been spent improving graphics security in the sandboxed context, there are still issues left. CVE-2014-1314, which I used at Pwn2Own 2014, is a typical example. The issue exists in the CoreGraphics session APIs. CoreGraphics provides a client-side API, CGSCreateSessionWithDataAndOptions, which sends a request to be handled by the server-side API _XCreateSession.
_XCreateSession eventually reaches the following code:

__int64 __fastcall __CGSessionLaunchWorkspace_block_invoke(__int64 a1)
{
...
  v28 = fork(); //fork
  if ( v28 == -1 )
  {
    v29 = *__error();
    CGSLogError("%s: cannot fork workspace (%d)", v37);
    v3 = 1011;
  }
  else
  {
    if ( !v28 )
    {
      setgid(HIDWORD(v24));
      setuid(v24); //set uid to current user’s uid
      setsid();
      chdir("/");
      v35 = open("/dev/null", 2, 0LL);
      v36 = v35;
      if ( v35 != -1 )
      {
        dup2(v35, 0);
        dup2(v36, 1);
        dup2(v36, 2);
...
        if ( v36 >= 3 )
          close(v36);
      }
      execve(v9, v40, v44);
      _exit(127);
    }
...

This function allows the user to create a new logon session. By default, WindowServer creates a new process from “/System/Library/CoreServices/loginwindow.app/Contents/MacOS/loginwindow” and launches the login window in the current user’s context (by calling setgid and setuid with the user’s IDs. Oh, WindowServer can setuid!!). Apple also allows the caller to specify a customized login window, which in turn allows attackers in a sandboxed context to run any program in an unsandboxed context.
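For readers who prefer source over decompiler output, the launch sequence is roughly the classic POSIX pattern below. This is a sketch, not Apple's code: demo_launch_as_user is a hypothetical name, the environment is left empty, and unlike the original the parent waits for the child so that the outcome can be observed. The point to notice is that `path` is whatever the caller asked for.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork, drop to the given gid/uid, detach from the terminal, and exec `path`.
 * Returns the child's exit status, or -1 if fork/wait failed or the child
 * did not exit normally. */
int demo_launch_as_user(const char *path, char *const argv[],
                        uid_t uid, gid_t gid)
{
    assert(path != NULL && argv != NULL);
    char *const envp[] = { NULL };  /* empty environment for the sketch */

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child. The group must be dropped before the user: once setuid()
         * has given up privileges, setgid() to another group would fail. */
        if (setgid(gid) != 0 || setuid(uid) != 0)
            _exit(126);
        setsid();
        if (chdir("/") != 0)
            _exit(126);

        /* Point stdin/stdout/stderr at /dev/null, as the original does. */
        int fd = open("/dev/null", O_RDWR);
        if (fd != -1) {
            dup2(fd, 0);
            dup2(fd, 1);
            dup2(fd, 2);
            if (fd >= 3)
                close(fd);
        }
        execve(path, argv, envp);
        _exit(127);  /* only reached if execve failed */
    }

    /* Parent: wait for the child (the original does not do this here). */
    int status = 0;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Nothing in this sequence re-applies a sandbox to the child, so when the path comes from a sandboxed client, the new process simply runs outside the sandbox with the user's IDs.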

0x7 CVE-2016-1804: the memory corruption

Now let’s come back to 2016. In CoreGraphics, some new interfaces (we count them in the Misc category) were introduced to support new MacBook models. For example, the interface _XSetGlobalForceConfig allows a user to configure Force Touch. Users provide serialized Force Touch configuration data; _XSetGlobalForceConfig wraps the serialized data in a CFData and calls the _mthid_unserializeGestureConfiguration API to unserialize it.

__int64 __fastcall _XSetGlobalForceConfig(__int64 a1, __int64 a2)
{
...
   v5 = *(_QWORD *)(a1 + 28); //v5 is a pointer pointing to user controllable data
   v6 = CFDataCreateWithBytesNoCopy(*(_QWORD *)kCFAllocatorDefault_ptr, 
        v5, 
        v4, 
        *(_QWORD *)kCFAllocatorNull_ptr); // create CFData on v5
  
  v7 = _mthid_unserializeGestureConfiguration(v6); //try to unserialize the data
   if ( v6 )
     CFRelease(v6, v5); //free the CFData twice!
...
}

_mthid_unserializeGestureConfiguration does not retain the CFData, yet it calls CFRelease to free the data if the Force Touch configuration is not valid. After _mthid_unserializeGestureConfiguration returns, _XSetGlobalForceConfig frees the data again, causing a double free.

__int64 __fastcall _mthid_unserializeGestureConfiguration(__int64 a1)
{
  ...
    if ( v2 )
    {
      if ( !(unsigned __int8)_mthid_isGestureConfigurationValid(v2) )
        CFRelease(a1); //if the data is invalid, free it once
      result = v2;
    }
  }
  return result;
}
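The ownership mistake is easier to see in a toy model. The sketch below replaces CFData with a hypothetical demo_object_t that counts releases instead of freeing memory, so an over-release can be observed safely rather than corrupting the heap. The two functions only mimic the call structure of _XSetGlobalForceConfig and _mthid_unserializeGestureConfiguration; none of this is the real implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy reference-counted object standing in for the CFData. */
typedef struct {
    int  retain_count;
    int  over_releases;  /* releases that arrived after destruction */
    bool destroyed;
} demo_object_t;

/* Toy release: a real allocator would free memory when the count hits zero,
 * and a second release would then touch freed memory. */
void demo_release(demo_object_t *obj)
{
    assert(obj != NULL);
    if (obj->destroyed) {
        obj->over_releases++;  /* this is the double free */
        return;
    }
    if (--obj->retain_count == 0)
        obj->destroyed = true;
}

/* Callee: releases a reference it never retained when the data is invalid. */
bool demo_unserialize(demo_object_t *data, bool config_is_valid)
{
    if (!config_is_valid)
        demo_release(data);
    return config_is_valid;
}

/* Caller: creates the object (count 1), calls the callee, then releases
 * unconditionally. Returns how many over-releases happened. */
int demo_set_global_force_config(bool config_is_valid)
{
    demo_object_t data = { 1, 0, false };
    demo_unserialize(&data, config_is_valid);
    demo_release(&data);
    return data.over_releases;
}
```

With a valid configuration the object is released exactly once; with an invalid one the callee's release destroys it and the caller's release hits an already-destroyed object.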

0x8 Wrap-up: exploitable?

CVE-2016-1804 looks unexploitable because:

– The time window between the two frees is small (and a crash follows if we fail to fill in data in between?)
– All CoreGraphics interfaces run in a server loop on a single thread, so it is not possible to use another CoreGraphics API to race and fill in the freed memory from another thread
– ASLR/DEP have to be taken into account

I leave these questions to the readers, and will discuss the exploitation in part 2 next week.