Measuring the Insecurity of Mobile Deep Links of Android

Fang Liu, Chun Wang, Andres Pico, Danfeng Yao, Gang Wang
Department of Computer Science, Virginia Tech
{fbeyond, wchun, andres, danfeng, gangwang}

Abstract

Mobile deep links are URIs that point to specific locations within apps, which are instrumental to web-to-app communications. Existing “scheme URLs” are known to have hijacking vulnerabilities where one app can freely register another app’s schemes to hijack the communication. Recently, Android introduced two new methods, “App links” and “Intent URLs”, which were designed with security features to replace scheme URLs. While the new mechanisms are secure in theory, little is known about how effective they are in practice.

In this paper, we conduct the first empirical measurement on various mobile deep links across apps and websites. Our analysis is based on the deep links extracted from two snapshots of 160,000+ top Android apps from Google Play (2014 and 2016), and 1 million webpages from Alexa top domains. We find that the new linking methods (particularly App links) not only failed to deliver the security benefits as designed, but significantly worsen the situation. First, App links apply link verification to prevent hijacking. However, only 194 apps (2.2% out of 8,878 apps with App links) can pass the verification due to incorrect (or no) implementations. Second, we identify a new vulnerability in App link’s preference setting, which allows a malicious app to intercept arbitrary HTTPS URLs in the browser without raising any alerts. Third, we identify more hijacking cases on App links than existing scheme URLs among both apps and websites. Many of them are targeting popular sites such as online social networks. Finally, Intent URLs have little impact in mitigating hijacking risks due to a low adoption rate on the web.

1 Introduction

With the wide adoption of smartphones, mobile websites and native apps have become the two primary interfaces to access online content [10, 44]. Today, a user can easily launch apps from websites with preloaded context, which becomes instrumental to many key user experiences. For instance, from a restaurant’s home page, users can tap a hyperlink to launch the phone app and call the restaurant, or launch Google Maps for navigation. Recently, users can even search in-app content with a web-based search engine (e.g., Google) and directly launch the target app by clicking the search result [5].

The key enabler of web-to-mobile communication is mobile deep links. Like web URLs, mobile deep links are universal resource identifiers (URI) for content and functions within apps [49]. The most widely used deep link is the scheme URL, supported by both Android [7] and iOS [3] since 2008. If an app wants to be launched from the web, the app can register URI schemes to the mobile OS during installation. For example, the Facebook app registers “fb://profile” to open user profiles. Later, when the link “fb://profile/user1” is clicked on the web, the OS can direct users to the Facebook app.

Threats to Mobile Deep Links. Despite the convenience, researchers have identified serious security vulnerabilities in scheme URLs [18, 19, 55]. The most significant one is link hijacking, where one app can register another app’s scheme and induce the mobile OS to open the wrong app. Fundamentally, link hijacking is possible because there is no restriction on what schemes apps can register. A malicious app may register “fb” to hijack the deep link request to the Facebook app to launch itself. This allows the malicious app to perform phishing attacks (e.g., displaying a fake Facebook login box) or steal sensitive data carried by the link (e.g., PII) [19, 35]. Even though Android and iOS may prompt users before launching an app, there are many cases where such prompting is skipped without user knowledge.

Recently, two new deep link mechanisms were proposed to address the security risks in scheme URLs: App link and Intent URL. 1) App Link [6, 9] was introduced to Android and iOS in 2015. It no longer allows developers to customize schemes, but exclusively uses the HTTP/HTTPS scheme. To prevent hijacking, App links introduced a way to verify the app-to-link association. More specifically, the mobile OS verifies a registered link (e.g., a link under Facebook’s web host) by contacting the corresponding web host for verification. This prevents apps other than Facebook from claiming this link. 2) Intent URL [2] is another solution introduced in 2013, which only works on Android. Intent URL defines how deep links should be called by websites. Instead of calling “fb://profile”, Intent URL explicitly specifies the destination app identifier (i.e., package name) in the parameter to avoid confusion.

Measurements. While most existing works focus on vulnerabilities in scheme URLs [18, 19, 55], little is known about how widely App links and Intent URLs are adopted, and how effective they are in mitigating the threat in practice. In this paper, we conduct the first large-scale measurement on the current ecosystem of mobile deep links. Our goal is to detect and measure link hijacking vulnerabilities across the web and mobile apps, and understand the effectiveness of new linking mechanisms in battling hijacking attacks.

We perform extensive measurements on a large collection of mobile apps and websites. To measure the adoption of different mobile deep links, we collected two snapshots of 160,000+ most popular Android apps from Google Play in 2014 and 2016, and crawled 1 million web pages (using a dynamic crawler) from Alexa top domains. We primarily focus on Android for its significant market share (87%) [29] and availability of apps. We also perform a subset of analysis on iOS deep links. At a high level, our method is to extract the link registration entries (URIs) from apps, and then measure their empirical usage on websites. To detect hijacking attacks, we group apps that register the same URIs as link collision groups. We find that not all link collisions are malicious: certain links are expected to be shared, such as links for common functionality (e.g., “tel”) or third-party libraries (e.g., “zxing”). We develop methods to identify malicious hijacking attempts.

Findings. Our study has four surprising findings, which lead to one overall conclusion: the newly introduced deep link solutions not only fail to improve security, but significantly increase hijacking risks for users.

First, App links’ verification mechanism fails in practice. Surprisingly, among 8,878 Android apps with App links, only 194 (2.2%) correctly implement link verification. The reasons are a combination of the lack of motivation from app developers and various developer mistakes. We confirm a subset of mistakes in iOS App links too: 1,925 out of 12,570 (15%) fail the verification due to server misconfigurations, including popular apps such as Airbnb.

Second, we uncover a new vulnerability in App links, which allows malicious apps to stealthily intercept HTTP/HTTPS URLs in the browser. The root cause is that Android grants excessive permissions to unverified App links through the preference setting. For an unverified App link, Android by default will prompt users to choose between the app and the browser. To disable prompting, users may set a “preference” to always use the app for this link. This preference is overly permissive, since it disables prompting not only for the current link, but for all other unverified links registered by the app. A malicious app, once it has received the preference, can hijack any sensitive HTTP/HTTPS URLs (e.g., to a bank website) without alerting users. We validate this vulnerability in the latest Android 7.1.1.

Third, we detect more malicious hijacking attacks on App links (1,593 apps) than scheme URLs (893 apps). Case studies show that popular websites and apps (e.g., Facebook) are common targets for traffic hijacking. In addition, we identify suspicious apps that act as the man-in-the-middle between websites and the original app to record sensitive URLs and their parameters.

Finally, Intent URLs have very limited impact in mitigating hijacking risks due to the low adoption rate among websites. Only 452 websites out of the Alexa top 1 million contain Intent URLs (0.05%), which is a much lower ratio than that of App links (48.0%) and scheme URLs (19.7%). Meanwhile, among these websites, App links drastically increase the number of links that have hijacking risks compared to existing vulnerable scheme URLs.

To the best of our knowledge, our study is the first empirical measurement on the ecosystem of mobile deep links across web and apps. We find the new linking methods not only fail to deliver the security benefits as designed, but significantly worsen the situation. There is a clear mismatch between the security design and practical implementations due to the lack of incentives of developers, developer mistakes, and inherent vulnerabilities in the link mechanism. Moving forward, we propose a list of suggestions to mitigate the threat. We have reported the over-permission vulnerability to the Google Android team. The detailed plan for further notification and risk mitigation is described in §8.

2 Background and Research Goals

Mobile deep links are URIs that point to specific locations within mobile apps. Through deep links, websites can initiate useful interactions with apps, which is instrumental to many key user experiences, for example, opening apps, sharing and bookmarking in-app pages [49],

and searching in-app content using search engines [5]. In the following, we briefly introduce how deep links work and the related security vulnerabilities. Then we describe our research goals and methodology.

Figure 1: Three types of mobile deep links: Scheme URL, App Link and Intent URL. (In the figure, a scheme URL “foo://p” and an App link reach app “foo” through an implicit intent; an Intent URL reaches it through an explicit intent.)

Figure 2: URI syntax for Scheme URLs and App links (scheme, host, path), e.g., the scheme URL fb://profile/1234.

Figure 3: App link verification process. (Steps shown: 1. register; 2. get assetlinks.json; 3. return assetlinks.json; 4. verify.)

2.1 Mobile Deep Links

To understand how deep links work, we first introduce inter-app communications on Android. An Android app is essentially a package of software components. One app’s components can communicate with another app’s components through Intent, a messaging object characterized by “action”, “category” and “data”. By sending an intent, one app can communicate with the other app’s front-end Activities, or background Services, Content Providers and Broadcast Receivers.

Mobile deep links trigger a particular type of intent to enable communications between the web and mobile apps. As shown in Figure 1, after users click on a link in the browser (or in-app WebView), the browser sends an intent to invoke the corresponding component in the target app. Unlike app-to-app communication, a mobile deep link can only launch a front-end Activity in the app.

Mobile deep links work in two simple steps: 1) Registration: an app “foo” should first register its URIs (“foo://” or an http/https link) to the mobile OS during installation. The URIs are declared in the “data” field of intent filters. 2) Addressing: when “foo://” is clicked, the mobile OS will search all the intent filters for a potential match. Since the link matches the URI of app “foo”, the mobile OS will launch this app.

2.2 Security Risks of Deep Linking

Hijacking Risk in Scheme URL. Scheme URL is the first generation of mobile deep links, and is the least secure one. It has been available since Android 1.0 [7] and iOS 2.0 [3] in 2008. Figure 2 shows the syntax of a scheme URL. App developers can customize any schemes and URIs for their app without any restriction.

Prior research has pointed out key security risks in scheme URLs [19, 55], given that any app can register other apps’ schemes. For example, apps other than Facebook can also register “fb://”. When a deep link is clicked, it triggers an “implicit intent” to open any app with a matched URI. This allows a malicious app to hijack the request to the Facebook app to launch itself, either for phishing (e.g., displaying a fake Facebook login box), or stealing sensitive data in the request [19, 35].

With an awareness of this risk, Android lets users be the security guard. When multiple apps declare the same URI, users will be prompted (with a dialog box) to select/confirm their intended app. However, if the malicious app is installed but the victim app is not, the malicious app will automatically skip the prompting and hijack the link without user knowledge. Even when both apps are installed, the malicious app may trick users into setting it as the “preference” and disabling prompting. Historically speaking, relying on end-users as the sole security defense is risky since users often fail to perceive the nature of an attack, leading to bad decisions [12, 22, 53].

Solution 1: App Link. App Link was introduced in October 2015 with Android 6.0 [6] as a more secure version of deep links. It was designed to prevent hijacking with two mechanisms. First, the authentic app can build an association with the corresponding website, which allows the mobile OS to open the App link exclusively using the authentic app. Second, App link no longer allows developers to customize their own schemes, but exclusively uses the http or https scheme.

Figure 3 shows the App link association process. Suppose app “foo” wants to register a wildcard link (ending in “*”) under its web host. The mobile OS will contact that host’s server for verification. The app’s developer needs to set up an association file “assetlinks.json” beforehand under the root directory (“/.well-known/”) of the server. This file must be hosted on an HTTPS server. If the file contains an entry that certifies that app “foo” is associated with the wildcard link, the mobile OS will confirm the association. The association file

contains a field called “sha256_cert_fingerprints”, which is the SHA256 fingerprint of the associated app’s signing certificate. The mobile OS is able to verify the fingerprint and prevent hijacking because only the authentic app has the corresponding signing certificate. Suppose a malicious app “bar” also wants to register the same link: the verification will fail, assuming the attacker cannot access the root of the server to modify the association file and the fingerprint.

Table 1: Conditions for whether users will be prompted after clicking a deep link on Android. Rows: link type (Scheme URL, App Link, Intent URL). Columns: >1 Apps, Set As Preference, Verified, Prompt User? *App Links always have at least one matched app, the mobile browser.

The iOS version of App links is called universal link, introduced in iOS 9.0 [9], which has the same verification process. The association file for iOS is “apple-app-site-association”. However, iOS and Android have different policies to handle failed verifications. iOS prohibits opening unverified universal links in apps. Android, however, leaves the decision to users: if an unverified link is clicked, Android prompts users to choose if they want to open the link in the app or the browser.

Solution 2: Intent URL. Intent URL was introduced in 2013 and only works on Android [2]. Intent URLs prevent hijacking by changing how the deep link is called on the website. As shown in Figure 1, instead of calling “foo://p”, Intent URL is structured as “intent://p/#Intent;scheme=foo;package=com.foo;end” where the package name of the target app is explicitly specified. Package name is a unique identifier for an Android app. Clicking an intent URL will launch an “explicit intent” to open the specified app.

Compared to scheme URLs and App links, Intent URL does not need special URI registration on the app. Intent URL can invoke the same interfaces defined by the URIs of scheme URLs or App links, as well as other exposed components [2].

2.3 Research Questions

While the hijacking risk of scheme URLs has been reported by existing research [18, 19, 55], little is known about how prevalently this risk exists among apps, and how effective the new mechanisms (App links and Intent URLs) are in reducing this risk in practice. We hypothesize that upgrading from scheme URL to App link/Intent URL is a non-trivial task, considering that scheme URLs may already have significant footprints on the web. Mobile platforms might be able to enforce changes to apps through OS updates, but their influence on the web is likely less significant. In this paper, we conduct the first large-scale measurement on the mobile deep link ecosystem to understand the adoption of different linking methods and their effectiveness in battling hijacking threats.

Threat Model. Our study focuses on the link hijacking threat since this is the security issue that App Links and Intent URLs aim to address. Link hijacking happens when a malicious app registers the URI that belongs to the victim app. If the mobile OS redirects the user to the malicious app, it can lead to phishing (e.g., the malicious app displays forged UI to lure user passwords) or data leakage (e.g., the deep link may carry sensitive data in the URL parameters such as PII and session IDs) [19, 35]. In this threat model, the mobile OS and browser (or WebView) are not the targets of the attack, and we assume they are not malicious.

The Role of Users. Users also play a role in this threat model. After clicking on a deep link, a user may be prompted with a dialog box to confirm the destination app. As shown in Table 1, prompting can be skipped in many cases. For scheme URLs, a malicious app can skip prompting if the victim app is not installed, or by tricking users into setting the malicious app as the “preference”. An App link can skip prompting if the link has been verified. Otherwise, users will be prompted to choose between the browser and the app. Intent URLs will not prompt users at all since the target app is explicitly specified.

Our Goals. Our study seeks to answer key questions regarding how mobile deep links are implemented in the wild and their security impact. We ask three sets of questions. First, how prevalently are different deep links adopted among apps over time? Are App links and Intent URLs implemented properly as designed? Second, how many apps are still vulnerable to hijacking attacks? How many vulnerable apps are exploited by other real-world apps? Third, how widely are hijacked links distributed among websites? How much do App links and Intent URLs contribute to mitigating such links?

To answer these questions, we first describe data collection (§3), and measure the adoption of App links and scheme URLs among apps (§4). We perform extensive security analyses to understand how effectively App links can prevent hijacking (§5), and then describe the method to detect hijacking attacks among apps (§6). Finally, we move to the web to measure the usage of Intent URLs,

and the prevalence of hijacked links (§7). In §8, we summarize key implications and discuss possible solutions.

3 Datasets

We collected data from both mobile apps and websites, including two snapshots of 160,000+ most popular Android apps in 2014 and 2016, and web pages from Alexa top 1 million domains.

Mobile Apps. To examine deep link registration, we crawled two snapshots of mobile apps from Google Play. The first snapshot App2014 contains 164,322 most popular free apps from 25 categories in December 2014 (crawled with an Android 4.0.1 client). In August 2016, we crawled a second snapshot of top 160,000 free apps using an Android 6.0.1 client. We find that 48,923 apps in App2014 are no longer listed on the market in 2016. Another 4,963 apps in the 2014 snapshot fell out of the top 160K list in 2016. To match the two datasets, we also crawled these 4,963 apps in 2016, forming an App2016 dataset of 164,963 apps. The two snapshots have 115,399 overlapping apps. For each app in App2016, we also obtained the developer information, download count, review count and rating.

Our app dataset is biased towards popular apps among the 2.2 million apps in Google Play [48]. Since these popular apps have more downloads, potential vulnerabilities could affect more users. Our result can serve as a lower bound of empirical risks.

Alexa Top 1 Million Websites. To understand deep link usage on the web, we crawled Alexa top 1 million domains [1] in October 2016. We simulate an Android browser (Android 6.0.1, Chrome/41.0.2272.96) to visit these web domains and load both the static HTML page (index page) and the dynamic content from JavaScript. This is done using a modified OpenWPM [25], a headless browser-based crawler. For each visit, the crawler loads the web page and waits for 300 seconds, allowing the page to load the dynamic content or perform the redirection. We store the final URL and HTML content. This crawling is also biased towards popular websites, assuming that deep links on these sites are more likely to be encountered by users. We refer to this dataset as Alexa1M.

4 Deep Link Registration by Apps

In this section, we start by analyzing mobile apps to understand deep link registration and adoption. In order to receive deep link requests, an app needs to register its URIs to the mobile OS during installation. Our analysis in this section focuses on Scheme URLs and App links. For Intent URLs, as described in §2, developers do not need special registrations in the app. Instead, it is up to the websites to decide whether to use Intent URLs or scheme URLs to launch the app. We will examine the adoption of Intent URLs later by analyzing web pages (§7).

We provide an overview of deep link adoption by analyzing 1) how widely the scheme URLs are adopted among apps, and 2) whether App links are in the process of replacing scheme URLs for better security.

4.1 Extracting URI Registration Entries

Android apps register their URIs in the manifest file (AndroidManifest.xml). Both Scheme URLs and App Links are declared in intent filters as a set of matching rules, which can either be actual links (fb://login/) or a wild card (fb://profile/*). Since there is no way to exhaustively obtain all links behind a wild card, we treat each matching rule as a registration entry. Given a manifest file, we extract deep link entries in three steps:

• Step 1: Detecting Open Interfaces. We capture all the Activity intent filters whose “category” field contains both BROWSABLE and DEFAULT. This returns all the components that are reachable from the web.
• Step 2: Extracting App Link. Among intent filters in Step 1, we capture those whose “action” contains VIEW. This returns intent filters with either App Links or Scheme URLs in their “data” fields (see footnote 1). We extract App Link URIs as those with http/https scheme. Note that App Link intent filters have a special field called autoVerify. If its value is TRUE, then the mobile OS will perform verification on the App link.
• Step 3: Extracting Scheme URL. All the non-http/https URIs from Step 2 are Scheme URLs.

Footnote 1: The remaining intent filters whose “action” is not VIEW can still be triggered by Intent URLs.

We apply the above method to our dataset and the result is summarized in Table 2. Among the 160K apps in App2016, we find that 20.3K apps adopt scheme URLs and 8.9K apps adopt App links. Note that for the apps in App2014 (Android 4.0 or lower), App Link had not been introduced to Android yet. We find that 4,545 apps in App2014 register http/https URIs, which are essentially scheme URLs with “http” or “https” as the scheme. For consistency, we still call these http/https links App links, but link verification is not supported for these apps.

4.2 Scheme URL vs. App Link

Next, we compare the adoption of Scheme URLs and App links across time, app categories and app popularity. We seek to understand if the new App links are on the way to replacing Scheme URLs.

Dataset   Total Apps   Apps accept Scheme URLs   Apps accept App Links   Apps accept either   Unique Schemes   Unique Web Hosts
App2014   164,322      10,565 (6.4%)             4,545 (2.8%)            12,428 (7.6%)        8,845            6,471
App2016   164,963      20,257 (12.3%)            8,878 (5.4%)            23,830 (14.5%)       18,839           18,561

Table 2: Two snapshots of Android apps collected in 2014 and 2016. 115,399 apps appear in both datasets; 48,923 apps in App2014 are no longer listed on the market in 2016; App2016 has 49,564 new apps.

Figure 4: # of new schemes and app link hosts per app between 2014 and 2016. (Axes: # of New Schemes/Hosts per App vs. CDF of Apps (%); series: Scheme, Host.)

Figure 5: % of apps w/ deep links; apps are divided by download count. (Axes: Download Count buckets [0, 1K), [1K, 1M), [1M, ∞) vs. Apps w/ Deep Links (%); series: App Links, Scheme URLs.)

Adoption over Time. As shown in Table 2, there are significantly more apps that started to adopt deep links from 2014 to 2016 (about 100% growth). However, the growth rates are almost the same for App links and Scheme URLs. There are still 2-3 times more apps using scheme URLs than those with App links. App links are far from replacing scheme URLs.

Figure 4 specifically looks at apps in both snapshots. We select those that adopt either type of deep links in either snapshot (13,538 apps), and compute the differences in their number of schemes/hosts between 2014 and 2016. We find that the majority of apps (over 96.2%) either added more deep links or remained the same. Almost no apps removed or replaced scheme URLs with App links. The conclusion is the same when we compare the number of URI rules (omitted for brevity). This suggests that scheme URLs are still heavily used, exposing users to potential hijacking threats.

App Popularity. We find that deep links are more commonly used by popular apps (based on download count). In Figure 5, we divide apps in 2016 into three buckets based on their download count: [0, 1K), [1K, 1M), [1M, ∞). The buckets have 20,654, 127,323 and 5,223 apps respectively. Then we calculate the percentage of apps that adopt deep links in each bucket. We observe that 33% of the 5,223 most popular apps adopt scheme URLs, and the adoption rate goes down to 8% for apps with <1K downloads. The trend is similar for App links. In addition, we find that apps with deep links have on average 4 million downloads per app, which is more than an order of magnitude higher than apps without deep links (125K downloads per app). As deep links are associated with popular apps, potential vulnerabilities can affect many users.

App Categories. Among the 25 app categories, we find that the following categories have the highest deep link adoption rate: SHOPPING (25.5%), SOCIAL (23.4%), LIFESTYLE (21.0%), NEWS AND MAGAZINES (20.5%) and TRAVEL AND LOCAL (20.2%). These apps are content-heavy and often handle users’ personally identifiable information (e.g., social network apps) and financial data (e.g., shopping apps). Link hijacking targeting these apps could have practical consequences.

5 Security Analysis of App Links

Our result shows that App links are still not as popular as scheme URLs. Then for apps that adopt App links, are they truly secure against link hijacking? As we discussed in §2.2, App link was designed to prevent hijacking through a link verification process. If a user clicks on an unverified App link, the mobile OS will prompt the user to choose whether he/she would like to open the link in the browser or using the app. In the following, we empirically analyze the security properties of App links in two aspects. First, we measure how likely app developers are to make mistakes when deploying App link verification. Second, we discuss a new vulnerability we discovered which allows malicious apps to skip user prompting when unverified App links are clicked. Malicious apps can exploit this to stealthily hijack arbitrary HTTP/HTTPS URLs in the mobile browser without user knowledge.

5.1 App Link Verification

We start by examining whether link verification truly protects apps from hijacking attacks. Since App link had not been introduced for App2014, all the http/https links in 2014 were unverified. In the following, we focus on apps in App2016. In total, there are 8,878 apps that register App links, involving 18,561 unique web domains. We crawled two snapshots of the association files for each domain in January and May of 2017 respectively. We use the January snapshot to discuss our key findings, and then use the May snapshot to check if the identified problems have been fixed.
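The association check described in Section 2.2 can be sketched as a pure function over an already-fetched association file. This is a hedged illustration, not the crawler used in the study: the function name, the returned category labels, the package name com.example.foo, and the shortened fingerprint are all hypothetical, and the file layout assumed here is the Digital Asset Links statement list (a JSON array whose entries carry a relation list and a target with namespace, package_name and sha256_cert_fingerprints).

```python
import json

# Relation string that grants an app permission to handle a site's URLs.
HANDLE_ALL_URLS = "delegate_permission/common.handle_all_urls"


def check_association(file_text, served_over_https, package_name, cert_fingerprint):
    """Classify one app-to-host association from already-fetched inputs.

    file_text is the body of the host's assetlinks.json, or None if the
    host does not serve one. The returned labels are illustrative only.
    """
    if file_text is None:
        return "no association file"
    if not served_over_https:
        return "file served over HTTP"
    try:
        statements = json.loads(file_text)
    except ValueError:
        return "invalid association file"
    if not isinstance(statements, list):
        return "invalid association file"

    wanted = cert_fingerprint.upper()
    for statement in statements:
        if not isinstance(statement, dict):
            continue
        relation = statement.get("relation")
        target = statement.get("target")
        if not isinstance(relation, list) or not isinstance(target, dict):
            continue
        fingerprints = target.get("sha256_cert_fingerprints")
        if not isinstance(fingerprints, list):
            continue
        # The app is certified only if package name AND signing
        # certificate fingerprint both match an entry in the file.
        if (HANDLE_ALL_URLS in relation
                and target.get("namespace") == "android_app"
                and target.get("package_name") == package_name
                and wanted in [str(f).upper() for f in fingerprints]):
            return "verified"
    return "app not listed in file"


# Hypothetical association file for illustration (fingerprint shortened).
SAMPLE_FILE = json.dumps([{
    "relation": [HANDLE_ALL_URLS],
    "target": {
        "namespace": "android_app",
        "package_name": "com.example.foo",
        "sha256_cert_fingerprints": ["AA:BB:CC"],
    },
}])

if __name__ == "__main__":
    print(check_association(SAMPLE_FILE, True, "com.example.foo", "AA:BB:CC"))
    print(check_association(SAMPLE_FILE, True, "com.example.bar", "AA:BB:CC"))
```

Keeping the check free of network calls makes each failure category easy to test in isolation; a real scanner would add the HTTPS fetch of the file under "/.well-known/" in front of it.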

Date     Apps w/ App Links   Verif. Turned On   Verified Apps   App Misconfig. (Wrong Host)   Host w/o Assoc. F.   Host w/ HTTP   Wrong Path   Invalid Assoc. F.   Other
Jan.17   8,878               415                194             26                            177                  11             0            10                  60
May.17   8,878               415                192             26                            171                  8              0            18                  57

Table 3: App Link verification statistics and common mistakes (App2016) based on data from January 2017 and May 2017. The last six columns count apps with failed verifications*. *One app can make multiple mistakes.

Type      Date     Hosts w/ Assoc. F.   Under HTTP    Wrong Path   Invalid File
iOS       Jan.17   12,570               1,817 (14%)   0 (0%)       108 (1%)
iOS       May.17   13,541               1,820 (13%)   0 (0%)       113 (.8%)
Android   Jan.17   1,833                330 (18%)     4 (.2%)      81 (4%)
Android   May.17   2,779                474 (17%)     0 (0%)       118 (4%)

Table 4: Association files for iOS and Android obtained after scanning 1,012,844 domains.

Failed Verifications. As of January 2017, we find a surprisingly low ratio of verified App links. Among 8,878 apps that register App Links, only 194 apps successfully pass the verification (2.2%). More specifically, only 415 apps (4.7%) set the “autoVerify” field as TRUE, which triggers the verification process during app installation. This means the vast majority of apps (8,463, 95.3%) do not even start the verification process. Interestingly, 434 apps actually have the association file ready on their web servers, but the developers seem to have forgotten to configure the apps to turn on the verification.

Even for apps that turn on the verification, only 194 out of 415 can successfully complete the process as of January 2017. Table 3 shows the common mistakes of the failed apps (one app can have multiple mistakes). More specifically, 26 apps incorrectly set the App link (e.g., with a wildcard in the domain name), which makes it impossible for the mobile OS to connect to the host. On the server side, 177 apps turn on the verification, but the destination domain does not host the association file; 11 apps host the file under an HTTP server instead of the required HTTPS server; 10 apps’ files are in invalid JSON format; 60 apps’ association files do not contain the App link (or the app) to be verified. Note that for these failed apps, we do not distinguish whether they are malicious apps attempting to verify with a domain they do not own, or simply mistakes by legitimate developers.

We confirm all these mistakes lead to failed verifications by installing and testing related apps on a physical phone. We observe many of these mistakes are made by popular apps from big companies. For example, Amazon’s official music app claims to be associated with a web host whose association file does not certify this app. We tested the app on our phone, and it indeed failed the verification.

In May 2017, we checked all the apps again and found that most of the identified problems remained unfixed. Moreover, some apps introduced new mistakes: there are 8 more apps with invalid association files in May than in January. Manual examination shows that new mistakes are introduced when the developers update the association files.

Misconfigurations for iOS and Android. To show that App link verification can be easily misconfigured, we put together 1,012,844 web domains to scan their association files. These 1,012,844 domains are the union of the Alexa top 1 million domains and the 18,561 domains extracted from our apps. We scan the association files for both Android and iOS.

As of January 2017, 12,570 domains (out of 1 million) have iOS association files and only 1,833 domains have Android association files (Table 4). It is unlikely that iOS-exclusive apps alone explain this roughly sevenfold gap. A more plausible explanation is that iOS developers are more motivated to perform link verification, since iOS prohibits opening unverified HTTP/HTTPS links in apps. In contrast, Android leaves the decision to users by prompting them to choose between using apps or a browser.

We find iOS apps also have significant misconfigurations. This analysis only covers a subset of possible mistakes compared to Table 3, but still returns a large number. As of January 2017, 1,817 domains (14%) are hosting the association file under HTTP, and there are an additional 108 domains (1%) with invalid JSON files. One example is Airbnb’s iOS app. The app tries to associate with a host that only serves the association file under an HTTP server. This means users will not be able to open this link in the Airbnb app.

In May 2017, we scanned these domains again. We observe a 7.7% increase in hosts with association files for iOS and a 51.6% increase for Android. However, the number of misconfigured association files also increased.

5.2 Over-Permission Vulnerability

In addition to verification failures, we identify a new vulnerability in the setting preferences for App links. Recall

that unverified App links still have one last security defense: the end user. Android OS prompts users when unverified App links are clicked, and users can choose between a browser and the matched app. We describe an over-permission vulnerability that allows malicious apps to skip prompting for stealthy hijacking.

Over-Permission through Preference Setting. User prompting is there for better security, but prompting users too much can hurt usability. Android’s solution is to take a middle ground using a “preference” setting. When an App link is clicked, users can set a “preference” for always opening the link in the native app without prompting again. We find that the preference setting gives excessive permissions. Specifically, the preference not only disables the prompting for the current link that the user sees, but also for all other (unverified) HTTP/HTTPS links that this app registers. For example, if the user sets preference for “ ”, all the links with “https://” in this app receive the permission. Exploiting this vulnerability allows malicious apps to hijack any HTTP/HTTPS URLs without alerting users.

Proof-of-Concept Attack. Suppose “bar” is a malicious app that registers both “ ” and “ ”. The user sets preference for using “bar” to open the link “ ”, which is a normal action. Then, without user knowledge, the permission also applies to “ ”. Later, suppose this user visits her bank’s website in a mobile browser, and transfers money through an HTTPS request “ amount=1000&recipient=tom ”. Because of the preference setting, this request will automatically trigger bar without prompting the user. The browser wraps up this URL and the parameters in plaintext to create an Intent, and hands it over to the app bar. bar can then change the recipient and use the session ID to transfer money to the attacker. In this example, the attacker sets the path of the URI as “/transfer/*” so that bar would only be triggered during money transfer. The app can make this even stealthier by quickly terminating itself after the hijacking, and bouncing the user back to the bank website in the browser.

We validate this vulnerability in both Android 6.0.1 and 7.1.1 (the latest version). We implement the proof-of-concept attack by writing a malicious Android app to hijack the author’s own blog website (instead of an actual bank). The attack is successful: the malicious app hijacked the plaintext parameters in the URL, and quickly bounced the user back to the original page in the browser. The bouncing is barely noticeable by users.

Discussion. Fundamentally, this vulnerability is caused by the excessive permission given to unverified App links. When setting preferences, the permission is not applied at the link level, but at the scheme level. We suspect that the preference system of App links is directly inherited from scheme URLs. For scheme URLs, the preference is also set at the scheme level, which makes more sense (e.g., allowing the Facebook app to open all “fb://”). However, for App links, scheme-level permission means attackers can hijack any HTTP/HTTPS links.

To successfully exploit this vulnerability, a malicious app needs to trick users into setting the preference (e.g., using benign functionalities). For example, an attacker may design a recipe app that allows users to open recipe web links in the app for easy display and sharing. This recipe app can ask users to set the preference for opening recipe links but secretly register an online bank’s App links to receive the same preference. We filed a bug report through Google’s Vulnerability Reward Program (VRP) in February 2017. We are currently working with the VRP team to mitigate the threat.

iOS has a similar preference setting, but it is not vulnerable to this over-permission attack. In iOS, if the user sets preference for one app to open an HTTPS link, the permission goes to all the HTTPS links that the app has successfully verified. The Android vulnerability is caused by the fact that permission goes to unverified links.

5.3 Summary of Vulnerable Apps

Thus far, our analysis shows that most apps are still vulnerable to link hijacking. First, scheme URLs are still heavily used among apps. Second, for apps that adopt App links, only 2% can pass the link verification. The over-permission vulnerability described above makes the situation even worse. In 2016, out of all 23,830 apps that adopt deep links, 23,636 apps either use scheme URLs or unverified App links. These are candidates for potential hijacking attacks.

6 Link Hijacking

While many apps are vulnerable in theory, the real question is how many vulnerable apps are exploited in practice. For a given app, how likely would other apps register the same URIs (a.k.a., link collision)? Do link collisions always have a malicious intention? If not, how can we distinguish malicious hijacking from benign collisions?

To answer these questions, we first measure how likely it is for different apps to register the same URIs. Our analysis reveals the key categories of link collisions, and we develop a systematic procedure to label all of them. This analysis allows us to focus on the highly suspicious groups that are involved in malicious hijacking. Finally,

[Figure 6: # of collision apps per scheme. x-axis: # of apps per scheme (log scale); y-axis: CDF of apps (%); series: App2014, App2016.]

[Figure 7: # of collision apps per web host. x-axis: # of apps per web host (log scale); y-axis: CDF of apps (%); series: App2014, App2016.]

Scheme                      Type   Apps     Web Host   Type   Apps
file                        F      1,278    “ ”        P      480
content                     F      727      “ ”        P      441
oauth                       T      520      “ ”        T      410
x-oauthflow-twitter         T      369      “ ”        P      187
x-oauthflow-espn-twitter    T      359      “ ”        P      148
zxing                       T      321      “ ”        P      131
testshop                    T      278      “ ”        T      126
shopgate-10006              T      278      “ ”        T      123
geo                         F      238      “ ”        T      112
tapatalk-byo                T      180      “ ”        T      110

Table 5: Top 10 schemes and app link hosts with link collisions in App2016. We manually label them into three types: F = Functional, P = Per-App, T = Third-party.

we present more in-depth case studies to understand the risk of typical attacks.

6.1 Characterizing Link Collision

Link collision happens when two or more apps register the same deep link URIs. When the link is clicked, it is possible for the mobile OS to direct users to the wrong app. Note that simply matching “scheme” or app link “host” is not sufficient. For example, “myapp://a/1” and “myapp://a/2” do not conflict with each other since they use different “paths” in the URI. To this end, we define two apps as having a link collision only if there is at least one link that is opened by both apps.

Prevalence of Link Collisions. To identify link collisions, we first group apps based on the scheme (scheme URL) or web host (App links). Figure 6 and Figure 7 show the number of apps that each scheme/host is associated with. About 95% of schemes are exclusively registered by one single app. The percentage is slightly lower for App links (76%–82%). Then for each group, we filter out apps that have no conflicting URIs with any other apps in the group, and produce apps with link collisions. Within App2014, we identify 394 schemes and 1,547 web hosts from 5,615 apps involved in link collisions. The corresponding numbers for 2016 are higher: 697 schemes and 3,272 web hosts from 8,961 apps.

Our result is a lower bound of actual collisions, biased towards popular apps. Schemes/hosts that are currently mapped to a single app might still have collisions with apps outside of our dataset. For the rest of our analysis, we focus on the more recent 2016 dataset.

Categorizing Link Collisions. We find that not all collisions have malicious intention. After manually analyzing these schemes and hosts, we categorize collisions into 3 types. Table 5 shows the top 10 most registered schemes/hosts and their labels.

• Functional scheme (F) is reserved for a common functionality, instead of a particular app. “file” is registered by 1,278 apps that can open files. “geo” is registered by 238 apps that can handle GPS coordinates. These schemes are expected to be registered by multiple apps. IANA [13] maintains a list of URI schemes, most of which are functional ones. This collision type does not apply to App links.

• Per-app scheme/host (P) is designated to an individual app. For example, “ ” is to open Google Maps (but registered by 186 other apps) and “fb” is supposed to open the Facebook app (but registered by 4 other apps). Collisions on per-app schemes/hosts are often malicious, except when all apps are from the same developer.

• Third-party scheme/host (T) is used by third-party libraries, which often leads to (unintentional) link collision. “x-oauthflow-twitter” is a callback URL for Twitter OAuth. Twitter suggests that developers define their own callback URL, but many developers copy-paste this scheme from an online tutorial (unintentional collision). “ ” is from a third-party RSS aggregator. Apps use this service to redirect user RSS requests to their apps (benign collision).

Because of the “shared” nature, functional schemes or third-party schemes/hosts are expected to be used by multiple apps. Related link collisions are benign or unintentional. In contrast, per-app schemes/hosts are (expected to be) designated to each app, and thus link collision can indicate malicious hijacking attempts.

6.2 Detecting Malicious Hijacking

Next, we detect malicious hijacking by labeling per-app schemes/hosts. This task is challenging since schemes and hosts are registered without much restriction: it is difficult to tell based on the name of the scheme/host. Our high-level intuition is: 1) third-party schemes/hosts often have official documentation to teach developers how to use the library, which is searchable online; 2) functional schemes are

                  Deep Links        Link            After Pre-      Functional    Third-party    Per-app
                  In Total          Collisions      Processing
#Schemes (#Apps)  18,839 (20,257)   697 (7,432)     376 (6,350)     30 (2,135)    197 (3,972)    149 (893)
#Hosts (#Apps)    18,561 (8,878)    3,272 (2,868)   2,451 (2,083)   N/A           137 (999)      2,314 (1,593)

Table 6: Filtering and classification results for schemes and App link hosts (App2016).

[Figure 8: # of collision apps per scheme. x-axis: # of collision apps per scheme (log scale); y-axis: CDF of apps (%); series: Per-app, Third-party, Functional.]

[Figure 9: # of collision apps per host. x-axis: # of collision apps per host (log scale); y-axis: CDF of apps (%); series: Per-app, Third-party.]

well-documented in public URI standards. To these ends, we develop a filtering procedure to label per-app schemes/hosts. For any manual labeling tasks, we have two authors perform the task independently, and a third person resolve any disagreements.

Pre-Processing. We start with the 697 schemes and 3,272 hosts (8,961 apps) that have link collisions in App2016. We exclude schemes/hosts where all the collision apps are from the same developer. This leaves 376 schemes and 2,451 web hosts for further labeling.

Classifying Schemes. We label schemes in two steps. The results are shown in Table 6. First, we filter out functional schemes. IANA [13] lists 256 common URI schemes, among which there are a few per-app schemes under “provisional” status (e.g., “spotify”). We manually filter them out and get 175 standard functional schemes. Matching this list with our dataset returns 30 functional schemes with link collisions. Then, to label third-party schemes, we manually search for their documentation or tutorials online. For certain third-party schemes, we also check the app code to be sure. In total, we identify 197 third-party schemes, and the remaining 149 schemes are per-app schemes (also manually checked). Figure 8 shows the number of collision apps for different schemes. Not surprisingly, per-app schemes have fewer collision apps than functional and third-party schemes.

Classifying App Link Hosts. This only requires separating third-party hosts from per-app hosts. In total, there are 2,451 hosts after pre-processing. We observe that 1,633 hosts are jointly registered by 5 apps, and 347 subdomains of “ ” are registered by 2 apps. All these hosts are not third-party hosts, which helps to trim the set down to 471 hosts for manual labeling. We follow the same intuition to label third-party web hosts by manually searching for their official documentation. In total, we label 137 third-party hosts and 2,314 per-app hosts. Figure 9 compares per-app hosts and third-party hosts on their number of collision apps, which are very similar.

Testing Automated Classification. Clearly, manual labeling cannot scale. Now that we have obtained the labels, we briefly explore the feasibility of automated classification. As a feasibility test, we classify per-app schemes from third-party schemes using 10 features such as unique developers per scheme and apps per scheme (feature list in Appendix). 5-fold cross-validation using SVM and Random Forests classifiers returns an accuracy of 59% (SVM) and 62% (RF). If we only focus on schemes that have a higher level of collisions (e.g., > 4 developers), it returns a higher accuracy: 84% (SVM) and 75% (RF). The accuracy is not high enough for practical usage. Intuitively, there are not many restrictions on how developers register their URIs, and thus it is possible that the patterns of per-app schemes are not that strong.

Since fully automated classification is not yet feasible, we then explore useful heuristics to help app market admins conduct collision auditing. We rank features based on information gain, and identify the top 3 features: average number of apps from the same developer (apDev), number of unique no-prefix components (npcNum) and number of unique components (ucNum). Regarding apDev, the intuition is that developers are likely to use a different per-app scheme for each of their apps, but would share the same third-party schemes (e.g., oauth) for all their apps. A larger apDev of the collision link indicates a higher chance of being a third-party scheme. Moreover, third-party schemes are likely to use the same component name for different apps (i.e., less unique), leading to smaller npcNum and ucNum.

6.3 Hijacking Results and Case Studies

In total, we identify 149 per-app schemes and 2,314 per-app hosts that are involved in link collisions. The related apps (893 and 1,593 respectively) are either the attacker or the victim in the hijacking attacks. To understand how per-app schemes and hosts are hijacked, we perform in-depth case studies on a number of representative attacks.

Traffic Hijacking. We find apps that register popular websites’ links (or popular apps’ schemes) seeking to redirect user traffic to themselves. For example, “ ” is registered by 480 apps from 305 non-Google developers. The scheme

Dataset    Intent URL     Scheme URL     App Link
           (Webpage)      (Webpage)      (Webpage)
Alexa1M    1,203 (452)    431K (197K)    3.2M (480K)

Table 7: Number of deep links (and webpages that contain deep links) in Alexa top 1 million web domains.

“google.navigation” from Google Maps is hijacked by 79 apps from 32 developers. The intuition is that popular sites and apps already have a significant number of links distributed to the web. Hijacking their links is likely to increase the attacker apps’ chance of being invoked. We find many popular apps are among the hijacking targets (e.g., Facebook, Airbnb, YouTube, Tumblr). Traffic hijacking is the most common attack.

URL Redirector MITM. A number of hijackings are conducted by “URL Redirector” apps. When users click on an http/https link in the browser, these Redirector apps redirect users to the corresponding apps. Essentially, Redirector apps play the role of the mobile OS in redirecting URLs, but their underlying mechanisms have several security implications. For example, URLLander ( ) and AppRedirect (com.nevoxo.tapatalk.redirect) have registered HTTPS links from 36 and 75 web domains respectively (unverified), and each has over 10,000 installs. We suspect that users install Redirector apps because of the convenience, since these apps allow users to open the destination apps (without bouncing to the browser) even if the destination apps have not yet adopted App links. The redirection is hard coded without the consent of the destination apps or the originating websites.

URL redirector apps can act as a man-in-the-middle (MITM) to hijack HTTP/HTTPS URLs. For example, URLLander registered “ ” for redirection. When a user visits “ ” using a browser (usually logged in), the URL contains sensitive parameters including a SESSIONID. Once the user agrees to use URLLander for redirection, the URL and SESSIONID will be handed over to URLLander by the browser in plaintext. This MITM threat applies to all the popular websites that Redirector apps registered such as “ ”, “ ” and “ ”, “ ”. Particularly for eBay, we find that the official eBay app explicitly does not register to open the link “ ”, but this link was registered by Redirector apps. We analyze the code of AppRedirect and find it actually writes every single incoming URL and its parameters to a log file. Redirection (and MITM) can be automated without prompting users by exploiting the over-permission vulnerability (see §5.2), if the user once sets a preference for just one of those links.

Hijacking a Competitor’s App. Many apps are competitors in the same business, and we find targeted hijacking cases between competing apps. For example, Careem (com.careem.acma) and QatarTaxi (com.qatar.qatartaxi) are two competing taxi booking apps in Dubai. Careem is more popular (5M+ downloads), and uses the scheme “careem” for many functionalities such as booking a ride (from hotel websites) and adding credit card information. QatarTaxi (10K downloads) registers to receive all “careem://*” deep links. After code analysis, we find all these links redirect users to the QatarTaxi app’s home screen, as an attempt to draw customers.

Bad Scheme Names. Hijackings are also caused by developers using easy-to-conflict scheme names. For example, Citi Bank’s official app uses “deeplink” as its per-app scheme, which conflicts with 6 other apps. These apps are not malicious, but may cause confusion: a user is going to open the Citi Bank app, but an unrelated app shows up (and vice versa). We detect 14 poorly named per-app schemes (e.g., “myapp”, “app”).

7 Mobile Deep Links on The Web

Our analysis shows that hijacking risks still widely exist within apps. Next, we move to the web side to examine how mobile deep links are distributed on the web, and estimate the likelihood of users encountering hijacked links. In addition, we focus on Intent URLs to examine their adoption and usage. We seek to estimate the impact of Intent URLs on mitigating hijacking threats.

In the following, we first measure the prevalence of Intent URLs on the web, and compare it with scheme URLs and App links. Then, we revisit the hijacked links detected in §6 and analyze their appearance on the web.

7.1 Intent URL Usage

Intent URL is a secure way of calling deep links from websites by specifying the target app’s package name (unique identifier). In theory, Intent URLs can be used to invoke existing app components defined by scheme URLs (and even App links) to prevent hijacking. The key question is how widely Intent URLs are adopted in practice.

Intent URLs vs. Other Links. We start by extracting mobile deep links from web pages in Alexa1M collected in §3. For App links and scheme URLs, we match all the hyperlinks in the HTML pages with the link registration entries extracted from apps. We admit that this method is conservative as we only include deep links registered by apps in our dataset. But the matching is necessary since not all the HTTP/HTTPS links or schemes on the web can invoke apps. For Intent URLs, we identify them

[Figure 10: Deep link distribution among Alexa top 1 million websites. Website domains are sorted and divided into 20 even-sized bins (50K sites per bin). We report the % of websites that contain deep links in each bin. Panels: (a) Intent URLs, (b) Scheme URLs, (c) App Links. x-axis: bins of websites (in 10 thousand); y-axis: % websites w/ deep link.]

[Figure 11: Number of websites that host deep links for each app. x-axis: # web domains per app (log scale); y-axis: CDF of apps (%); series: Intent URL, Scheme URL.]

[Figure 12: Different type of hijacked deep links in Alexa1M. y-axis: # of deep links (thousand, log scale); groups: Scheme URL, App Link; series: Functional, Third-party, Per-app.]

[Figure 13: Webpages that contain hijacked deep links in Alexa1M. y-axis: # of websites (thousand, log scale); groups: Scheme URL, App Link; series: Functional, Third-party, Per-app.]

based on their special format (“intent://*;end”). The matching results are shown in Table 7.

The key observation is that Intent URLs are rarely used. Out of 1 million web domains, only 452 (0.05%) contain Intent URLs in their index page. As a comparison, App links and Scheme URLs appear in 480K (48%) and 197K (19.7%) of these sites. For the total number of links, Intent URL is also orders of magnitude lower than other links (1,203 versus 3.2M and 431K). This extremely low adoption rate indicates that Intent URLs have little impact on mitigating hijacking risks in practice.

Challenges to Intent URL Adoption. Since Android still supports scheme URLs, it is possible that developers are not motivated to use Intent URLs to replace the still-functional scheme URLs. In addition, even if security-aware developers use Intent URLs on their own websites, it is difficult for them to upgrade scheme URLs that have been distributed to other websites.

As shown in Figure 10(a), Intent URLs are highly skewed towards high-ranked websites. In contrast, Scheme URLs are more likely to appear in low-ranked domains (Figure 10(b)), and App links’ distribution is relatively even (Figure 10(c)). A possible explanation is that popular websites are more security-aware.

Then we focus on apps, and examine how many websites contain an app’s deep links (Figure 11). We find that most apps have their Intent URLs on a single website (90%). We randomly select 40+ pairs of the one-to-one mapped apps and websites for manual examination. We find that almost all websites (except 2) are owned by the app developers, which confirms our intuition. Scheme URLs are found in more than 5 websites for 90% of apps (50 websites for more than half of the apps). It is challenging to remove or upgrade scheme URLs across all these sites.

Insecure Usage of Intent URL. Among the 1,203 Intent URLs, we find 25 Intent URLs that did not specify the package name of the target app (only the host or scheme). These 25 Intent URLs can be hijacked.

7.2 Measuring Hijacking Risk on Web

To estimate the level of hijacking risks on the web, we now revisit the hijacking attacks detected in §6 (those on per-app schemes/hosts). We seek to measure the volume of hijacked links among webpages, and estimate App links’ contribution over the existing risks introduced by scheme URLs.

Hijacked Mobile Deep Links. We extract links from Alexa1M that are registered by multiple apps, which returns 408,455 scheme URLs and 2,741,817 App links. Among them, 7,242 scheme URLs and 2,619,565 App links contain per-app schemes/hosts (i.e., hijacked links). The key observation is that App links introduce orders of magnitude more hijacked links than scheme URLs, as shown in Figure 12 (log scale y-axis). We further examine the number of websites that contain hijacked links. As shown in Figure 13, App links have a dominating contribution: 456K websites (out of 1 million, 45.6%) contain per-app App links that are subject to link hijacking. The corresponding number for scheme URLs is 5.3K websites (0.5%).

App links, designed as the secure version of deep links, actually expose users to a higher level of risk. Intuitively, http/https links have been used on the web for decades. Once apps register App links, a large number of existing http/https links on the web are automatically interpreted as App links. This creates more opportunities for malicious apps to perform link hijacking.

Links Carrying Sensitive Data. To illustrate the practical consequences of link hijacking, we perform a quick analysis on the hijacked links with a focus on their parameters. A quick keyword search returns 74 sensitive parameter names related to authentication (e.g., authToken, sessionid, password, access_token, full list in Appendix). We find that 1,075 hijacked links contain at least one of the sensitive parameters. A successful hijacking will expose these parameters to the attacker app. This is just one example, and by no means exhaustive in terms of possibly sensitive data carried in hijacked links (e.g., PII, location).

8 Discussion

Key Implications. Our results shed light on the practical challenges of mitigating vulnerable mobile deep links. First, scheme URL was designed for mixed purposes, including invoking a generic function (functional/third-party schemes) and launching a target app (per-app schemes). The multipurpose design makes it difficult to uniformly enforce security policies (e.g., associating schemes to apps). A more practical solution should prohibit per-app schemes, while not crippling the widely deployed functional/third-party schemes on the web.

Second, App links and Intent URLs were designed with security in mind. However, their practical usage has deviated from the initial design. Particularly for App links, 98% of apps did not implement link verification correctly. In addition to various configuration errors, a more important reason is that unverified links still work on Android, and developers are likely not motivated to verify links. As a result, App links not only fail to provide better security, but worsen the situation significantly by introducing more hijackable links.

Finally, the insecurity of deep links leads to a tough trade-off between security and usability. Mobile deep links were designed for usability, to enable seamless context-aware transitions from web to apps. However, due to the insecure design, mobile platforms have to constantly prompt users to confirm the links they clicked, which in turn hurts usability. The current solution for Android (and iOS) takes a middle ground, by letting users set a “preference” for certain apps to disable prompting. We find this leads to new security vulnerabilities (over-permission risk in §5.2) that allow malicious apps to hijack arbitrary HTTP/HTTPS URLs in the Android browser.

Legacy Issue. Android does not strongly enforce App link verification, possibly due to legacy issues. First, scheme URLs are still widely used on websites as discussed in §7. Disabling scheme links altogether would inevitably affect users’ web browsing experience (e.g., causing broken links [8]). Second, according to Google’s report [11], over 60% of Android devices are still using Android 5.0 or earlier versions, which do not support App link verification. Android allows apps (6.0 or higher) to use verified App links while maintaining backward compatibility by not enforcing the verification.

Countermeasures. We discuss three countermeasures to mitigate link hijacking risks. In the short term, the most effective countermeasure would be disabling scheme URLs in mobile browsers and WebViews. Note that this is not to disable the app interfaces defined by schemes, but to encourage (force) websites to use Intent URLs to invoke per-app schemes safely. Android may also whitelist a set of well-defined functional schemes to avoid massively breaking functional links. For customized scheme URLs that are still used on the web, Android needs to handle their failure gracefully without severely degrading user experience. The second countermeasure is prohibiting apps from opening unverified App links to prevent link hijacking. The drawback is that apps without a web front would face difficulties using deep links: they will need to rely on third-party services such as [4] or Firebase [5] to host their association files. The third is addressing the over-permission vulnerability (§5.2) by adopting a more fine-grained preference setting (e.g., at the host level or even the link level). This threat would also go away if Android strictly enforced App link verification.

Vulnerability Notification & Mitigation. Our study identifies new vulnerabilities and attacks, and we are taking active steps to notify the related parties for risk mitigation.

First, regarding the over-permission vulnerability, we filed a bug report through Google’s Vulnerability Reward Program (VRP) in February 2017. As of June 2017, we have established a case and submitted the second round of materials including the proof-of-concept app and a demo of the attack. We are waiting for further responses from Google. Second, we have reported our findings to the Android anti-malware team and the Firebase team regarding the massive number of unverified App links and the misconfiguration issues. Details regarding their mitigation plan, however, were not disclosed to us. Third, as shown in §5.1, most of the misconfigured App links have not been fixed after 5 months. In the next step, we plan to contact the developers, particularly those of hijacked apps, and help them to mitigate the configuration errors.
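The configuration errors enumerated in §5.1 (Table 3) lend themselves to a mechanical pre-release check that developers could run themselves. The following is a minimal Python sketch under stated assumptions, not the verification logic that Android runs: `check_app_link` and its failure labels are hypothetical names introduced here, the association file is passed in as a string rather than fetched over the network, and certificate fingerprints are deliberately not compared. It assumes the Digital Asset Links layout, i.e., a JSON list of statements that each carry a `relation` list and a `target` with `namespace` and `package_name`.

```python
import json
from typing import List, Optional

# Relation string used by Digital Asset Links statements for App links.
HANDLE_ALL_URLS = "delegate_permission/common.handle_all_urls"


def check_app_link(host: str, file_scheme: str,
                   file_body: Optional[str], package: str) -> List[str]:
    """Return the failure categories that apply; an empty list means
    none of the modeled mistakes was found (hypothetical helper)."""
    problems: List[str] = []
    if "*" in host:
        # The paper reports wildcard hosts as an app-side misconfiguration.
        problems.append("app misconfiguration: wildcard host")
    if file_body is None:
        problems.append("host without association file")
        return problems
    if file_scheme != "https":
        problems.append("association file served over HTTP")
    try:
        statements = json.loads(file_body)
    except json.JSONDecodeError:
        problems.append("invalid association file")
        return problems
    if not isinstance(statements, list):
        problems.append("invalid association file")
        return problems
    # The file must contain a statement that names this app.
    certified = any(
        isinstance(s, dict)
        and isinstance(s.get("relation"), list)
        and HANDLE_ALL_URLS in s["relation"]
        and isinstance(s.get("target"), dict)
        and s["target"].get("namespace") == "android_app"
        and s["target"].get("package_name") == package
        for s in statements
    )
    if not certified:
        problems.append("association file does not list this app")
    return problems
```

Because one app can make several mistakes at once, the function accumulates labels instead of stopping at the first failure, which mirrors how Table 3 counts apps in more than one column.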

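The third countermeasure above, a finer-grained preference setting, can be illustrated with a toy model. The sketch below is a simplification written for this discussion and does not reproduce Android's intent resolution: `LinkFilter` and `opens_silently` are hypothetical names, and the hosts `recipes.example` and `bank.example` are placeholders standing in for the recipe app and bank scenario of §5.2. It only shows how the scope of a single user approval changes with the granularity at which the preference is stored.

```python
from dataclasses import dataclass
from typing import List
from urllib.parse import urlsplit


@dataclass(frozen=True)
class LinkFilter:
    """One unverified link registration of an app (simplified)."""
    scheme: str
    host: str
    path_prefix: str = "/"

    def matches(self, url: str) -> bool:
        parts = urlsplit(url)
        return (parts.scheme == self.scheme
                and parts.hostname == self.host
                and (parts.path or "/").startswith(self.path_prefix))


def opens_silently(filters: List[LinkFilter], approved_url: str,
                   clicked_url: str, granularity: str) -> bool:
    """After the user approved the app for approved_url, would the app
    open clicked_url without a prompt under the given granularity
    ("scheme", "host" or "link")?"""
    approved = [f for f in filters if f.matches(approved_url)]
    if not approved:
        return False
    for f in filters:
        if not f.matches(clicked_url):
            continue
        if granularity == "scheme" and any(f.scheme == a.scheme for a in approved):
            return True
        if granularity == "host" and any(
                (f.scheme, f.host) == (a.scheme, a.host) for a in approved):
            return True
        if granularity == "link" and f in approved:
            return True
    return False


# A recipe app that also registers a bank's transfer path.
FILTERS = [LinkFilter("https", "recipes.example"),
           LinkFilter("https", "bank.example", "/transfer/")]
APPROVED = "https://recipes.example/pasta"
BANK = "https://bank.example/transfer/?amount=1000&recipient=tom"
```

With these placeholder values, `opens_silently(FILTERS, APPROVED, BANK, "scheme")` is true, which corresponds to the over-permission behavior described in §5.2, while the `"host"` and `"link"` granularities keep the bank URL behind a prompt.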
Limitations. Our study has a few limitations. First, our conclusions are limited to mobile deep links of Android. Although iOS takes a stricter approach to enforcing the link verification, it remains to be seen how well the security guarantees are achieved in practice. Our brief measurement in §5.1 already shows that iOS universal links also have misconfigurations. More extensive measurements are needed to fully understand the potential security risks of iOS deep links. Second, our measurement scope is still limited compared to the size of the Android app market and the whole web. We argue that the data size is sufficient to draw our conclusions. By measuring the most popular apps (160,000+) and web domains (1,000,000), we collect strong evidence on the ineffectiveness of the newly introduced linking mechanisms in providing better security. Third, we only focus on the link hijacking threat, because this is the security issue that App links and Intent URLs were designed to address. There are other threats related to web-to-mobile communications such as exploiting WebViews and browsers [20, 37], and cross-site request forgery on apps [27, 46, 50]. Our work is complementary to existing work to better understand and secure the web-and-app ecosystem.

9 Related Work

Inter-app Communication & Deep Links. Researchers have discovered various vulnerabilities in the inter-app communication mechanism in Android [19, 23] and iOS [52], which lead to potential hijacking and spoofing attacks. The fundamental issue is a lack of source and destination authentication [52]. In the context of app-to-app communication, attacks may cause permission escalation [15, 21] and sensitive data leakage [46]. Mobile deep links (e.g., scheme URL) inherit some of these vulnerabilities when facilitating communications between websites and apps. Unlike web URLs whose uniqueness is guaranteed by the DNS, mobile deep links lack a similar, centralized entity for link registration and addressing. As a result, multiple apps may register the same link, leading to hijacking risks. Our work is complementary to existing work since we focus on large-scale empirical measurements, providing new understanding of how the risks are mitigated in practice.

Other recent works on mobile deep links focus on improving usability instead of security. Two systems are proposed to automatically generate deep links for apps via static and dynamic code analysis [38, 49].

Mobile Browser Security. In web-to-app communications, mobile browsers play an important role in bridging websites and apps, which can also be the target of attacks. For example, malicious websites may attack the browser using XSS [27, 50] and origin-crossing [52]. The threat also applies to customized in-app browsers (called WebView) [20, 37, 40, 51]. In our work, we focus on hijacking threats to apps, a different threat model in which the browser is not the target.

Detection and Mitigation. Existing research has explored different approaches to detect vulnerabilities in app-to-app communications. On one hand, static code analysis leverages call graphs and flow analysis to detect information leakages [15, 26, 36, 45, 57] and vulnerable interfaces for inter-app communications [14, 32, 33, 34, 41, 42, 43]. On the other hand, dynamic analysis tracks information flow at runtime, which can capture attacks that would otherwise be missed by static analysis [24, 28, 30, 54, 56]. To remove and mitigate vulnerabilities, researchers propose to automatically generate app patches [39, 45, 58], enforce strict policies [16, 17, 31, 51, 59] and provide guidelines for writing safer apps [31]. Our work highlights the significant gap between a security solution and the practical impact in mitigating threats. Beyond technical solutions, other factors such as developer incentives and capabilities and mobile platform policies also play a big role.

10 Conclusion

In this paper, we conducted the first large-scale measurement study on mobile deep links across popular Android apps and websites. Our results showed strong evidence that the newly proposed deep link methods (App links and Intent URLs) fail to address the existing hijacking risks in practice. In addition, we identified new vulnerabilities and empirical misconfigurations in App links which ultimately expose users to a higher level of risk. Finally, we made a list of suggestions to counter the link hijacking risks in Android. Moving forward, we plan to further investigate automated methods for hijacking detection, and conduct more extensive measurements on iOS deep links.

Acknowledgments

The authors wish to thank the anonymous reviewers and our shepherd Manuel Egele for their helpful comments, and Bolun Wang for sharing the scripts to collect the meta data of Android apps. This project was supported by NSF grant CNS-1717028. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of any funding agencies.

References

[1] Alexa.
[2] Android Intents with Chrome. https://…android/intents.
[3] App programming guide for ios. https://…content/documentation/iPhone/Conceptual/iPhoneOSProgrammingGuide/Inter-AppCommunication/Inter-AppCommunication.html.
[4] Branch.
[5] Firebase App Indexing. https://firebase.…
[6] Handling App Links. https://developer.…html.
[7] Interacting with Other Apps. https://…basics/intents/filters.html.
[8] iOS 9.2 Update: The Fall of URI Schemes and the Rise of Universal Links. https://blog.…uri-scheme-and-universal-links/.
[9] Support Universal Links. https://…documentation/General/Conceptual/AppSearch/UniversalLinks.html.
[10] Smartphone apps crushing mobile web times. …Smartphone-Apps-Crushing-Mobile-Web-Time/1014498, October 2016.
[11] Android platform versions. https://developer.…html, May 2017.
[12] Akhawe, D., and Felt, A. P. Alice in warningland: A large-scale field study of browser security warning effectiveness. In Proc. of USENIX Security (2013).
[13] Authority, I. A. N. Uniform resource identifier (URI) schemes. http://…schemes/uri-schemes.xhtml, February 2017.
[14] Bagheri, H., Sadeghi, A., Garcia, J., and Malek, S. COVERT: Compositional analysis of Android inter-app permission leakage. IEEE Transactions in Software Engineering (2015).
[15] Bosu, A., Liu, F., Yao, D. D., and Wang, G. Collusive data leak and more: Large-scale threat analysis of inter-app communications. In Proc. of ASIACCS (2017).
[16] Bugiel, S., Davi, L., Dmitrienko, A., Fischer, T., and Sadeghi, A.-R. XManDroid: A new Android evolution to mitigate privilege escalation attacks. Technische Universität Darmstadt, Technical Report TR-2011-04 (2011).
[17] Bugiel, S., Heuser, S., and Sadeghi, A.-R. Flexible and fine-grained mandatory access control on Android for diverse security and privacy policies. In Proc. of USENIX Security (2013).
[18] Chen, E. Y., Pei, Y., Chen, S., Tian, Y., Kotcher, R., and Tague, P. OAuth demystified for mobile application developers. In Proc. of CCS (2014).
[19] Chin, E., Felt, A. P., Greenwood, K., and Wagner, D. Analyzing inter-application communication in Android. In Proc. of MobiSys (2011).
[20] Chin, E., and Wagner, D. Bifocals: Analyzing webview vulnerabilities in Android applications. In Proc. of WISA (2014).
[21] Davi, L., Dmitrienko, A., Sadeghi, A.-R., and Winandy, M. Privilege escalation attacks on Android. In Proc. of ISC (2011).
[22] Egelman, S., Cranor, L. F., and Hong, J. You’ve been warned: An empirical study of the effectiveness of web browser phishing warnings. In Proc. of CHI (2008).
[23] Elish, K. O., Yao, D., and Ryder, B. G. On the need of precise inter-app ICC classification for detecting Android malware collusions. In Proc. of MoST (2015).
[24] Enck, W., Gilbert, P., Han, S., Tendulkar, V., Chun, B.-G., Cox, L. P., Jung, J., McDaniel, P., and Sheth, A. N. TaintDroid: an information-flow tracking system for realtime privacy monitoring on smartphones. ACM TOCS 32, 2 (2014), 5.
[25] Englehardt, S., and Narayanan, A. Online tracking: A 1-million-site measurement and analysis. In Proc. of CCS (2016).

[26] Gordon, M. I., Kim, D., Perkins, J. H., Gilham, L., Nguyen, N., and Rinard, M. C. Information flow analysis of Android applications in DroidSafe. In Proc. of NDSS (2015).
[27] Hay, R., and Amit, Y. Android browser cross-application scripting (cve-2011-2357). Tech. rep., July 2011.
[28] Hay, R., Tripp, O., and Pistoia, M. Dynamic detection of inter-application communication vulnerabilities in Android. In Proc. of ISSTA (2015).
[29] International Data Corporation (IDC). Smartphone OS Market Share. http://…market-share.jsp, November 2016.
[30] Jing, Y., Ahn, G.-J., Doupé, A., and Yi, J. H. Checking intent-based communication in Android with intent space analysis. In Proc. of ASIACCS (2016).
[31] Kantola, D., Chin, E., He, W., and Wagner, D. Reducing attack surfaces for intra-application communication in Android. In Proc. of SPSM (2012).
[32] Klieber, W., Flynn, L., Bhosale, A., Jia, L., and Bauer, L. Android taint flow analysis for app sets. In Proc. of SOAP (2014).
[33] Li, L., Bartel, A., Bissyandé, T. F. D. A., Klein, J., Le Traon, Y., Arzt, S., Rasthofer, S., Bodden, E., Octeau, D., and McDaniel, P. IccTA: detecting inter-component privacy leaks in Android apps. In Proc. of ICSE (2015).
[34] Liu, F., Cai, H., Wang, G., Yao, D. D., Elish, K. O., and Ryder, B. G. MR-Droid: A scalable and prioritized analysis of inter-app communication risks. In Proc. of MoST (2017).
[35] Liu, Y., Song, H. H., Bermudez, I., Mislove, A., Baldi, M., and Tongaonkar, A. Identifying personal information in internet traffic. In Proc. of COSN (2015).
[36] Lu, L., Li, Z., Wu, Z., Lee, W., and Jiang, G. CHEX: Statically vetting Android apps for component hijacking vulnerabilities. In Proc. of CCS (2012).
[37] Luo, T., Hao, H., Du, W., Wang, Y., and Yin, H. Attacks on webview in the Android system. In Proc. of ACSAC (2011).
[38] Ma, Y., Liu, X., Hu, Z., Du, R., Liu, Y., Yu, M., and Huang, G. DroidLink: Automated generation of deep links for Android apps. CoRR abs/1605.06928 (2016).
[39] Mulliner, C., Oberheide, J., Robertson, W., and Kirda, E. PatchDroid: Scalable third-party security patches for Android devices. In Proc. of ACSAC (2013).
[40] Mutchler, P., Doupé, A., Mitchell, J., Kruegel, C., and Vigna, G. A large-scale study of mobile web app security. In Proc. of IEEE MoST (2015).
[41] Octeau, D., Jha, S., Dering, M., McDaniel, P., Bartel, A., Li, L., Klein, J., and Le Traon, Y. Combining static analysis with probabilistic models to enable market-scale Android inter-component analysis. In Proc. of POPL (2016).
[42] Octeau, D., McDaniel, P., Jha, S., Bartel, A., Bodden, E., Klein, J., and Le Traon, Y. Effective inter-component communication mapping in Android: An essential step towards holistic security analysis. In Proc. of USENIX Security (2013).
[43] Ravitch, T., Creswick, E. R., Tomb, A., Foltzer, A., Elliott, T., and Casburn, L. Multi-App security analysis with FUSE: Statically detecting Android app collusion. In Proc. of PPREW (2014).
[44] Rowinski, D. Digital strategy: Why native apps versus mobile web is a false choice. https://…apps-versus-mobile-web-decision/, September 2016.
[45] Sbîrlea, D., Burke, M. G., Guarnieri, S., Pistoia, M., and Sarkar, V. Automatic detection of inter-application permission leaks in Android applications. IBM Journal of Research and Development 57, 6 (2013), 10–1.
[46] Schlegel, R., Zhang, K., Zhou, X., Intwala, M., Kapadia, A., and Wang, X. Soundcomber: A stealthy and context-aware sound trojan for smartphones. In Proc. of NDSS (2011).
[47] Starov, O., Gill, P., and Nikiforakis, N. Are you sure you want to contact us? quantifying the leakage of pii via website contact forms. In Proc. of PETS (2016).

[48] Statista: The Statistics Portal. Number of available applications in the Google Play store from December 2009 to February 2016. http://…number-of-available-applications-in-the-google-play-store/, 2016.
[49] Tanzirul Azim, Oriana Riva, S. N. uLink: Enabling user-defined deep linking to app content. In Proc. of Mobisys (2016).
[50] Terada, T. Attacking Android browsers via intent scheme urls. Tech. rep., March 2014.
[51] Tuncay, G. S., Demetriou, S., and Gunter, C. A. Draco: A system for uniform and fine-grained access control for web code on Android. In Proc. of CCS (2016).
[52] Wang, R., Xing, L., Wang, X., and Chen, S. Unauthorized origin crossing on mobile platforms: Threats and mitigation. In Proc. of CCS (2013).
[53] Wu, M., Miller, R. C., and Garfinkel, S. L. Do security toolbars actually prevent phishing attacks? In Proc. of CHI (2006).
[54] Xia, M., Gong, L., Lyu, Y., Qi, Z., and Liu, X. Effective real-time android application auditing. In Proc. of IEEE S&P (2015).
[55] Xing, L., Bai, X., Li, T., Wang, X., Chen, K., Liao, X., Hu, S.-M., and Han, X. Cracking app isolation on apple: Unauthorized cross-app resource access on MAC OS X and iOS. In Proc. of CCS (2015).
[56] Yang, K., Zhuge, J., Wang, Y., Zhou, L., and Duan, H. IntentFuzzer: Detecting capability leaks of Android applications. In Proc. of ASIACCS (2014).
[57] Yang, Z., Yang, M., Zhang, Y., Gu, G., Ning, P., and Wang, X. S. AppIntent: analyzing sensitive data transmission in Android for privacy leakage detection. In Proc. of CCS (2013).
[58] Zhang, M., and Yin, H. AppSealer: Automatic generation of vulnerability-specific patches for preventing component hijacking attacks in Android applications. In Proc. of NDSS (2014).
[59] Zhang, Y., Yang, M., Gu, G., and Chen, H. FineDroid: Enforcing permissions with system-wide application execution context. In Proc. of SecureComm (2015).

Appendix

Features for Classifying Schemes. Table 8 shows a list of features for classifying per-app schemes and third-party schemes in §6. These features are selected based on the intuition that third-party schemes are likely to be used by a larger variety of apps and developers, but are used for similar components in the third-party library.

Sensitive Mobile Deep Link Parameters. Table 9 is a list of sensitive parameters identified in the mobile deep links from Alexa top 1 million websites. We exclusively focus on link parameters that are related to authentication. These parameter names are used in §7 to match hijacked deep links that carry sensitive data. We obtain this list by keyword searching and manual annotation. This is by no means an exhaustive list. The goal is to provide examples to illustrate practical consequences of link hijacking attacks.

Feature   Description
aNum      Total # of apps
uDev      # of developers
cNum      Total # of components
ucNum     # of unique components
utcNum    # of unique third-party components
npcNum    # of unique components name (no prefix)
tDev      # of developers with third-party components
apDev     Average # of apps of the same developer
tDevP     % of third-party developers
ucP       % of unique components

Table 8: Features used for scheme classification.

access_token, actionToken, api_key, apikey, apiToken, Auth, auth_key, auth_token, authenticity_token, authkey, authToken, autologin, AWSAccessKeyId, cookie, csrf_token, csrfKey, csrfToken, ctoken, fk_session_id, FKSESSID, FOGSESSID, force_sid, formkey, gsessionid, guestaccesstoken, hkey, IKSESSID, imprToken, jsessionid, key, keycode, keys, LinkedinToken, live_configurator_token, LLSESSID, MessageKey, mrsessionid, navKey, newsid, oauth_callback, oauth_token, pasID, pass, pass_key, password, PHPSESSID, piggybackCookie, plkey, redir_token, reward_key, roken2, seasonid, secret_key, secret_perk_token, ses_key, sesid, SESS, sessid, sessid2b4f0b11dea2f7ae4bfff49b6307d50f, SESSION, session_id, session_rikey, sessionGUID, sessionid, sh_auth, sharedKey, SID, tok, token, uepSessionToken, vt_session_id, wmsAuthSign, ytsession

Table 9: Sensitive parameters in mobile deep links.
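
As an illustration of how parameter names such as those in Table 9 could be used to flag deep links that carry sensitive data, the following Python sketch parses a link's query string and checks each parameter name against a small subset of the list. The function name, the chosen subset, the case-insensitive comparison, and the example links in the usage note are assumptions made for illustration; the paper does not spell out its exact matching procedure.

```python
from urllib.parse import urlsplit, parse_qsl

# Illustrative subset of the Table 9 names; not the full list.
SENSITIVE_PARAMS = {
    "access_token", "auth_token", "csrf_token", "jsessionid",
    "password", "PHPSESSID", "session_id", "token",
}

# Assumption: parameter names are compared case-insensitively.
_SENSITIVE_LOWER = {name.lower() for name in SENSITIVE_PARAMS}


def sensitive_params_in_link(link):
    """Return the sorted query parameter names in `link` that match the list."""
    query = urlsplit(link).query
    # keep_blank_values=True so that parameters with empty values still count.
    names = {name for name, _ in parse_qsl(query, keep_blank_values=True)}
    return sorted(n for n in names if n.lower() in _SENSITIVE_LOWER)
```

For a hypothetical scheme URL such as `example://open?access_token=abc&page=1`, the function reports only `access_token`; a link without a query string yields an empty list. Exact name matching is deliberately conservative here, so a real measurement would likely also need manual review, as the keyword search and annotation described above suggest.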
