preinstalledAndroidSW preprint


An Analysis of Pre-installed Android Software

Julien Gamba∗†, Mohammed Rashed†, Abbas Razaghpanah‡, Juan Tapiador† and Narseo Vallina-Rodriguez∗§
∗ IMDEA Networks Institute, † Universidad Carlos III de Madrid, ‡ Stony Brook University, § ICSI

Abstract

The open-source nature of the Android OS makes it possible for manufacturers to ship custom versions of the OS along with a set of pre-installed apps, often for product differentiation. Some device vendors have recently come under scrutiny for potentially invasive private data collection practices and other potentially harmful or unwanted behavior of the pre-installed apps on their devices. Yet, the landscape of pre-installed software in Android has largely remained unexplored, particularly in terms of the security and privacy implications of such customizations. In this paper, we present the first large-scale study of pre-installed software on Android devices from more than 200 vendors. Our work relies on a large dataset of real-world Android firmware acquired worldwide using crowd-sourcing methods. This allows us to answer questions related to the stakeholders involved in the supply chain, from device manufacturers and mobile network operators to third-party organizations like advertising and tracking services, and social network platforms. Our study allows us to also uncover relationships between these actors, which seem to revolve primarily around advertising and data-driven services. Overall, the supply chain around Android’s open source model lacks transparency and has facilitated potentially harmful behaviors and backdoored access to sensitive data and services without user consent or awareness. We conclude the paper with recommendations to improve transparency, attribution, and accountability in the Android ecosystem.

I. INTRODUCTION

The openness of the Android source code makes it possible for any manufacturer to ship a custom version of the OS along with proprietary pre-installed apps on the system partition. Most handset vendors take this opportunity to add value to their products as a market differentiator, typically through partnerships with Mobile Network Operators (MNOs), online social networks, and content providers. Google does not forbid this behavior, and it has developed its Android Compatibility Program [8] to set the requirements that the modified OS must fulfill in order to remain compatible with standard Android apps, regardless of the modifications introduced. Devices made by vendors that are part of the Android Certified Partners program [5] come pre-loaded with Google’s suite of apps (e.g., the Play Store and YouTube). Google does not provide details about the certification processes. Companies that want to include the Google Play service without the certification can outsource the design of the product to a certified Original Design Manufacturer (ODM) [7].

Certified or not, not all pre-installed software is deemed as wanted by users, and the term “bloatware” is often applied to such software. The process by which a particular set of apps ends up packaged together in the firmware of a device is not transparent, and various isolated cases reported over the last few years suggest that it lacks end-to-end control mechanisms to guarantee that shipped firmware is free from vulnerabilities [24], [25] or potentially malicious and unwanted apps. For example, at Black Hat USA 2017, Johnson et al. [82], [47] gave details of a powerful backdoor present in the firmware of several models of Android smartphones, including the popular BLU R1 HD. In response to this disclosure, Amazon removed Blu products from their Prime Exclusive line-up [2]. A company named Shanghai Adups Technology Co. Ltd. was pinpointed as responsible for this incident. The same report also discussed the case of how vulnerable core system services (e.g., the widely deployed MTKLogger component developed by the chipset manufacturer MediaTek) could be abused by co-located apps. The infamous Triada trojan has also been recently found embedded in the firmware of several low-cost Android smartphones [77], [66]. Other cases of malware found pre-installed include Loki (spyware and adware) and Slocker (ransomware), which were spotted in the firmware of various high-end phones [6].

Android handsets also play a key role in the mass-scale data collection practices followed by many actors in the digital economy, including advertising and tracking companies. OnePlus has been under suspicion of collecting personally identifiable information (PII) from users of its smartphones through exceedingly detailed analytics [55], [54], and also deploying the capability to remotely root the phone [53], [52]. In July 2018 the New York Times revealed the existence of secret agreements between Facebook and device manufacturers such as Samsung [32] to collect private data from users without their knowledge. This is currently under investigation by the US Federal authorities [33]. Additionally, users from developing countries with lax data protection and privacy laws may be at an even greater risk. The Wall Street Journal has exposed the presence of a pre-installed app that sends users’ geographical location as well as device identifiers to GMobi, a mobile-advertising agency that engages in ad-fraud activities [14], [67]. Recently, the European Commission publicly expressed concern about Chinese manufacturers like Huawei, alleging that they were required to cooperate with national intelligence services by installing backdoors on their devices [30].

Research Goals and Findings

To the best of our knowledge, no research study has so far systematically studied the vast ecosystem of pre-installed Android software and the privacy and security concerns associated with it. This ecosystem has remained largely unexplored due to the inherent difficulty of accessing such software at scale and across vendors. This state of affairs makes such

a study even more relevant, since i) these apps – typically unavailable on app stores – have mostly escaped the scrutiny of researchers and regulators; and ii) regular users are unaware of their presence on the device, which could imply lack of consent in data collection and other activities.

In this paper, we seek to shed light on the presence and behavior of pre-installed software across Android devices. In particular, we aim to answer the questions below:
• What is the ecosystem of pre-installed apps, including all actors in the supply chain?
• What are the relationships between vendors and other stakeholders (e.g., MNOs and third-party services)?
• Do pre-installed apps collect private and personally-identifiable information (PII)? If so, with whom do they share it?
• Are there any harmful or other potentially dangerous apps among pre-installed software?

To address the points described above, we developed a research agenda revolving around four main items:
1) We collected the firmware and traffic information from real-world devices using crowd-sourcing methods (§II). We obtained the firmware from 2,748 users spanning 1,742 device models from 214 vendors. Our user base covers 130 countries from the main Android markets. Our dataset contains 424,584 unique firmware files, but only 9% of the collected APKs were found in Google Play. We complement this dataset with traffic flows associated with 139,665 unique apps, including pre-installed ones, provided by over 20.4K users of the Lumen app [86] from 144 countries. To the best of our knowledge, this is the largest dataset of real-world Android firmware analyzed so far.
2) We performed an investigation of the ecosystem of pre-installed Android apps and the actors involved (§III) by analyzing the Android manifest files of the app packages, their certificates, and the Third-Party Libraries (TPLs) they use. Our analysis covers 1,200 unique developers associated with major manufacturers, vendors, MNOs, and Internet service companies. We also uncover a vast landscape of third-party libraries (11,665 unique TPLs), many of which mainly provide data-driven services such as advertisement, analytics, and social networking.
3) We extracted and analyzed an extensive set of custom permissions (4,845) declared by hardware vendors, MNOs, third-party services, security firms, industry alliances, chipset manufacturers, and Internet browsers. Such permissions may potentially expose data and features to over-the-top apps and could be used to access privileged system resources and sensitive data in a way that circumvents the Android permission model. A manual inspection reveals a complex supply chain that involves different stakeholders and potential commercial partnerships between them (§IV).
4) We carried out a behavioral analysis of nearly 50% of the apps in our dataset using both static and dynamic analysis tools (§V). Our results reveal that a significant part of the pre-installed software exhibits potentially harmful or unwanted behavior. While it is known that personal data collection and user tracking are pervasive in the Android app ecosystem as a whole [78], [84], [85], we find that they are also quite prevalent in pre-installed apps. We have identified instances of user tracking activities by pre-installed Android software – and embedded third-party libraries – which range from collecting the usual set of PII and geolocation data to more invasive practices that include personal email and phone call metadata, contacts, and a variety of behavioral and usage statistics in some cases. We also found a few isolated malware samples belonging to known families, according to VirusTotal, with prevalence in the last few years (e.g., Xynyin, SnowFox, Rootnik, Triada and Ztorg), and generic trojans displaying a standard set of malicious behaviors (e.g., silent app promotion, SMS fraud, ad fraud, and URL click fraud).

All in all, our work reveals complex relationships between actors in the Android ecosystem, in which user data seems to be a major commodity. We uncover a myriad of actors involved in the development of mobile software, as well as poor software engineering practices and lack of transparency in the supply chain that unnecessarily increase users’ security and privacy risks. We conclude this paper with various recommendations to palliate this state of affairs, including transparency models to improve attribution and accountability, and clearer mechanisms to obtain informed consent. Given the scale of the ecosystem and the need to perform manual inspections, we will gradually make our dataset available to the research community and regulators to boost investigations.

II. DATA COLLECTION

Obtaining pre-installed apps and other software artifacts (e.g., certificates installed in the system root store) at scale is challenging. As purchasing all the mobile handset models (and their many variations) available in the market is unfeasible, we decided to crowdsource the collection of pre-installed software using a purpose-built app: Firmware Scanner [34]. Using Firmware Scanner, we obtained pre-installed software from 1,742 device models. We also decided to use Lumen, an app that aims to promote mobile transparency and enable user control over their mobile traffic [86], [49], to obtain anonymized network flow metadata from Lumen’s real users. This allows us to correlate the information we extract from static analysis, for a subset of mobile apps, with realistic network traffic generated by mobile users in the wild and captured in user-space. In the remainder of this section, we explain the methods implemented by each app and present our datasets. We discuss the ethical implications of our data collection in Section II-C.

A. Firmware Scanner

Publicly available on Google Play [34], Firmware Scanner is a purpose-built Android app that looks for and extracts pre-installed apps and DEX files in the priv-app and app folders located in /system/, libraries in the lib64 and lib folders in /system/, any files in the /system/vendor/ folder if that directory exists, and root certificates located in /system/etc/security/cacerts/. We can distinguish pre-installed apps from user-installed ones as the latter are stored in /data/app/. In order to reduce the scanning

and upload time, Firmware Scanner first computes the MD5 hashes of the relevant files (e.g., apps, libraries, and root certificates) and then sends the list of these hashes to our server. Only those missing in our dataset are uploaded over a Wi-Fi connection to avoid affecting the user’s data plan.

Dataset: Thanks to 2,748 users who have organically installed Firmware Scanner, we obtained firmware versions for 1,742 unique device models¹ branded by 214 vendors² as summarized in Table I. Our dataset contains 424,584 unique files (based on their MD5 hash) as shown in Figure 1 for selected vendors. For each device we plot three dots, one for each type of file, while the shape indicates the major Android version that the device is running.³ The number of pre-installed files varies greatly from one vendor to another. Although it is not surprising to see a large number of native libraries due to hardware differences, some vendors embed hundreds of extra apps (i.e., “.apk” files) compared to other manufacturers running the same Android version. For the rest of our study, we focus on 82,501 Android apps present in the dataset, leaving the analysis of root certificates and libraries for future work.

[Figure 1: Number of files per vendor. x-axis: vendor; y-axis: number of files (log scale); file types: apps, libs, certs; marker shape: major Android version (4 to 9). We do not display the vendors for which we have fewer than 3 devices.]

Our user base is geographically distributed across 130 countries, yet 35% of our users are located in Europe, 29% in America (North and South), and 24% in Asia. Further, up to 25% and 20% of the total number of devices in our dataset are Samsung and Huawei devices, respectively. This is consistent with market statistics available online [35], [10]. While both manufacturers are Google-certified vendors, our dataset also contains low-end Android devices from manufacturers targeting markets such as Thailand, Indonesia, and India – many of these vendors are not Google-certified. Finally, to avoid introducing any bias in our results, we exclude 321 potentially rooted handsets from our study.⁴

¹ We use the MD5 hash of the IMEI to uniquely identify a user, and the build fingerprint reported by the vendor to uniquely identify a given device model. Note that two devices with the same fingerprint may be customized and therefore have different apps pre-installed.
² We rely on the vendor string self-reported by the OS vendor, which could be bogus. For instance, Alps rebrands as “iPhone” some of its models, which, according to information available online, are Android-based replicas of iOS.
³ We found that 5,244 of the apps do not have any activity, service, or receiver. These apps may potentially be used as providers of resources (e.g., images, fonts) for other apps.
⁴ We consider that a given device is rooted according to three signals. First, when Firmware Scanner has finished the upload of pre-installed binaries, the app asks the user whether the handset is rooted according to their own understanding (note that the user may choose not to answer the question). As a complement, we use the library RootBeer [63] to programmatically check if a device is rooted or not. If any of these sources indicates that the device is potentially rooted, we consider it as such. Finally, we discard devices where there is evidence of custom ROMs having been installed (e.g., LineageOS). We discuss the limitations of this method in Section VI.

B. Lumen

Lumen is an Android app available on Google Play that aims to promote mobile transparency and enable user control over their personal data and traffic. It leverages the Android VPN permission to intercept and analyze all Android traffic in user-space and in-situ, even if encrypted, without needing root permissions. By running locally on the user’s device, Lumen is able to correlate traffic flows with system-level information and app activity. Lumen’s architecture is publicly available and described in [86]. Lumen allows us to accurately determine which app is responsible for an observed PII leak from the vantage point of the user and as triggered by real user and device stimuli in the wild. Since all the analysis occurs on the device, only processed traffic metadata is exfiltrated from the device.

Dataset: For this study, we use anonymized traffic logs provided by over 20.4K users from 144 countries (according to Google Play Store statistics) coming from Android phones manufactured by 291 vendors. This includes 34,553,193 traffic flows from 139,665 unique apps (298,412 unique package name and version combinations). However, as Lumen does not collect app fingerprints or hashes of files, to find the overlap between the Lumen dataset and the pre-installed apps, we match records sharing the same package name, app version, and device vendor as the ones in the pre-installed apps dataset. While this method does not guarantee that the overlapping apps are exactly the same, it is safe to assume that phones that are not rooted are not shipped with different apps under the same package names and app versions. As a result, we have 1,055 unique pre-installed app/version/vendor combinations present in both datasets.

C. Ethical Concerns

Our study involves the collection of data from real users who organically installed Firmware Scanner or Lumen on their devices. Therefore, we follow the principles of informed consent [76] and we avoid the collection of any personal or sensitive data. We sought the approval of our institutional Ethics Board and Data Protection Officer (DPO) before starting the data collection. Both tools also provide extensive privacy policies in their Google Play profile. Below we discuss details specific to each tool.

Firmware Scanner: The app collects some metadata about the device to attribute observations to manufacturers (e.g., its

model and fingerprint) along with some data about the pre-installed applications (extracted from the Package Manager), network operator (MNO), and user (the timezone, and the MCC and MNC codes from their SIM card, if available). We compute the MD5 hash of the device’s IMEI to identify duplicates and updated firmware versions for a given device.

Table I: General statistics for the top-5 vendors in our dataset.

Vendor               Country      Certified  Device        Users  Files   Apps    Libs    DEX     Root certs  Files    Apps
                                  partner    fingerprints         (med.)  (med.)  (med.)  (med.)  (med.)      (total)  (total)
Samsung              South Korea  Yes        441           924    868     136     556     83      150         260,187  29,466
Huawei               China        Yes        343           716    1,084   68      766     96      146         150,405  12,401
LGE                  South Korea  Yes        89            154    675     84      385     74      150         58,273   3,596
Alps Mobile          China        No         65            136    632     56      385     46      148         29,288   2,883
Motorola             US/China     Yes        50            110    801     127     454     62      151         28,291   2,158
Total (214 vendors)  -            22%        1,742         2,748  -       -       -       -       -           424,584  82,501

Lumen: Users are required to opt in twice before initiating traffic interception [76]. Lumen preserves its users’ privacy by performing flow processing and analysis on the device, only sending anonymized flow metadata for research purposes. Lumen does not send back any unique identifiers, device fingerprints, or raw traffic captures. To further protect users’ privacy, Lumen also ignores all flows generated by browser apps, which may potentially deanonymize a user, and allows the user to disable traffic interception at any time.

III. ECOSYSTEM OVERVIEW

The openness of Android OS has enabled a complex supply chain ecosystem formed by different stakeholders, be it manufacturers, MNOs, affiliated developers, or distributors. These actors can add proprietary apps and features to Android devices, seeking to provide a better user experience, add value to their products, or provide access to proprietary services. However, this could also be for (mutual) financial gain [32], [14]. This section provides an overview of pre-installed Android packages to uncover some of the gray areas that surround them, the large and diverse set of developers involved, the presence of third-party advertising and tracking libraries, and the role of each stakeholder.

A. Developer Ecosystem

We start our study by analyzing the organizations signing each pre-installed app. First, we cluster apps by the unique certificates used to sign them and then we rely on the information present in the Issuer field of the certificate to identify the organization [15]. Despite the fact that this is the most reliable signal to identify the organization signing the software, it is still noisy as a company can use multiple certificates, one for each organizational unit. More importantly, these are self-signed certificates, which significantly lowers the trust that can be placed in them.

We were unable to identify the company behind several certificates (denoted as Unknown company in Table II) due to insufficient or dubious information in the certificate: e.g., the Issuer field only contains the mentions Company and department. We have come across apps that are signed by 42 different ”Android Debug” certificates on phones from 21 different brands. This reflects poor and potentially insecure development practices as Android’s debug certificate is used to automatically sign apps in development environments, hence enabling other apps signed with that certificate to access its functionality without requesting any permission. Most app stores (including Google Play) will not accept the publication of an app signed with a Debug certificate [9]. Furthermore, we also found as many as 115 certificates that only mention “Android” in the Issuer field. A large part (43%) of those certificates are supposedly issued in the US, while others seem to have been issued in Taiwan (16%), China (13%), and Switzerland (13%). In the absence of a public list of official developer certificates, it is not possible to verify their authenticity or know their owner, as discussed in Section VI.

With this in mind, we extracted 1,200 unique certificates out of our dataset. Table II shows the 5 most present companies in the case of phone vendors (left) and other development companies (right). This analysis uncovered a vast landscape of third-party software in the long-tail, including large digital companies (e.g., LinkedIn, Spotify, and TripAdvisor), as well as advertising and tracking services. This is the case of ironSource, an advertising firm signing pre-installed software [43] found in Asus, Wiko and other vendors, and TrueCaller, a service to block unwanted calls or texts [57]. According to their website and also independent sources [40], [71], TrueCaller uses crowdsourced mechanisms to build a large dataset of phone numbers used for spam and also for advertising. Likewise, we have found 123 apps (by their MD5) signed by Facebook. These apps are found in 939 devices, 68% of which are Samsung’s. We have also found apps signed by AccuWeather, a weather service previously found collecting personal data aggressively [87], Adups software, responsible for the Adups backdoor [46], and GMobi [36], a mobile-advertising company previously accused of dubious practices by the Wall Street Journal [14].

B. Third-party Services

As in the web, mobile app developers can embed in their pre-installed software third-party libraries (TPLs) provided by other companies, including libraries (SDKs) provided by ad networks, analytics services or social networks. In this section we use LibRadar++, an obfuscation-resilient tool to identify TPLs used in Android apps [91], on our dataset to examine their presence due to the potential privacy implications for users: when present in pre-installed apps, TPLs have the capacity to monitor users’ activities longitudinally [90], [85]. We exclude well-known TPLs providing development support such as the Android support library. First, we classify the 11,665 unique TPLs identified by LibRadar++ according to the categories reported by Li et al. [83], AppBrain [51],

and PrivacyGrade [58]. We manually classified those TPLs that were not categorized by these datasets.

Table II: Left: top-5 most frequent developers (as per the total number of apps signed by them), and right: for other companies.

Left (vendors):
Company name     Country        Certified partner?  Number of certificates
Google           United States  N/A                 92
Motorola         US/China       Yes                 65
Asus             Taiwan         Yes                 60
Samsung          South Korea    Yes                 38
Huawei           China          Yes                 29
Total (vendors)  -              -                   740

Right (other companies):
Company name      Country        Number of certificates  Number of vendors
MediaTek          China          19                      17
Aeon              China          12                      3
Tinno Mobile      China          11                      6
Verizon Wireless  United States  10                      5
Unknown company   China          7                       1
Total             -              460                     214

We focus on categories that could cause harm to the users’ privacy, such as mobile analytics and targeted advertisement libraries. We find 334 TPLs in such categories, as summarized in Table III. We could identify advertising and tracking companies such as Smaato (specialized in geo-targeted ads [64]), GMobi, Appnext, ironSource, Crashlytics, and Flurry. Some of these third-party providers were also found shipping their own packages in Section III-A or are prominent actors across apps published in Google Play Store [85]. We found 806 apps embedding Facebook’s Graph SDK, distributed over 748 devices. The certificates of these apps suggest that 293 of them were signed by the device vendor, and 30 by an operator (only 98 are signed by Facebook itself). The presence of Facebook’s SDKs in pre-installed apps could, in some cases, be explained by partnerships established by Facebook with Android vendors as the New York Times revealed [32].

Table III: Selected TPL categories present in pre-installed apps. In brackets, we report the number of TPLs when grouped by package name.

Category          # libraries  # apps  # vendors  Example
Advertisement     164 (107)    11,935  164        Braze
Mobile analytics  100 (54)     6,935   158        Apptentive
Social networks   70 (20)      6,652   157        Twitter
All categories    334          25,333  165        -

We found other companies that provide mobile analytics and app monetization schemes such as Umeng, Fyber (previously Heyzap), and Kochava [85]. Moreover, we also found instances of advanced analytics companies in Asus handsets such as Appsee [17] and Estimote [28]. According to their website, Appsee is a TPL that allows developers to record and upload the users’ screen [16], including touch events [84]. Even if recording the user’s screen does not by itself constitute a privacy leak, recording and uploading this data could unintentionally leak private information such as account details. Estimote develops solutions for indoor geo-localization [28]. Estimote’s SDK allows an app to react to nearby wireless beacons to, for example, send personalized push notifications to the user upon entering a shop.

Finally, we find TPLs provided by companies specialized in the Chinese market [91] in 548 pre-installed apps. The most relevant ones are Tencent’s SDK, AliPay (a payment service) and Baidu SDK [20] (for advertising and geolocation / geocoding services), the last two possibly used as replacements for Google Pay and Maps in the Chinese market, respectively. Only one of the apps embedding these SDKs is signed by the actual third-party service provider, which indicates that their presence in pre-installed apps is likely due to the app developers’ design decisions.

C. Public and Non-public Apps

We crawled the Google Play Store to identify how many of the pre-installed apps found by Firmware Scanner are available to the public. This analysis took place on the 19th of November, 2018 and we only used the package name of the pre-installed apps as a parameter. We found that only 9% of the package names in our dataset are indexed in the Google Play Store. For those indexed, a few categories dominate the spectrum of pre-installed apps according to Google Play metadata, notably communication, entertainment, productivity, tools, and multimedia apps.

The low presence of pre-installed apps in the store suggests that this type of software might have escaped any scrutiny by the research community. In fact, we have found samples of pre-installed apps developed by prominent organizations that are not publicly available on Google Play. Examples include software developed and signed by Facebook (e.g., com.facebook.appmanager), Amazon, and CleanMaster among others. Likewise, we found non-publicly available versions of popular web browsers (e.g., UME Browser, Opera).

Looking at the last update information reported by Android’s package manager for these apps, we found that pre-installed apps also present on Google Play are updated more often than the rest of pre-installed apps: 74% of the non-public apps do not seem to get updated and 41% of them remained unpatched for 5 years or more. If a vulnerability exists in one of these applications (see Section V), the user may stay at risk for as long as they keep using the device.

IV. PERMISSION ANALYSIS

Android implements a permissions model to control apps’ access to sensitive data and system resources [56]. By default, apps are not allowed to perform any protected operation. Android permissions are not limited to those defined by AOSP: any app developer – including manufacturers – can define their own custom permissions to expose their functionality to other apps [26]. We leverage Androguard [4] to extract and study the permissions, both declared and requested, by pre-installed apps. We primarily focus on custom permissions as i) pre-installed services have privileged access to system resources, and ii) privileged pre-installed services may (involuntarily) expose critical services and data, even bypassing Android’s official permission set.
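To illustrate the distinction between declared and requested custom permissions used in Section IV, the following is a minimal Python sketch. It is not the pipeline used in this study (which relies on Androguard); it assumes the app's AndroidManifest.xml has already been decoded to plain XML, and it uses only the Python standard library. The sample manifest, the function names, and the prefix list used to recognize AOSP- and Google-defined permissions are illustrative assumptions, not an exhaustive or authoritative filter.

```python
import xml.etree.ElementTree as ET

# Namespace used by the android:name attribute in decoded manifests.
ANDROID_NS = "http://schemas.android.com/apk/res/android"
NAME_ATTR = f"{{{ANDROID_NS}}}name"

# Assumption: a simple prefix heuristic stands in for a full list of
# AOSP-defined and Google messaging permissions.
STANDARD_PREFIXES = (
    "android.permission.",
    "com.android.",
    "com.google.android.c2dm.",
)

# Hypothetical manifest, already decoded from the binary format to plain XML.
SAMPLE_MANIFEST = """
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
          package="com.example.vendor.service">
  <permission android:name="com.example.vendor.permission.READ_STATS"
              android:protectionLevel="signature"/>
  <uses-permission android:name="android.permission.INTERNET"/>
  <uses-permission android:name="com.example.partner.permission.ACCESS"/>
</manifest>
""".strip()


def is_custom(permission: str) -> bool:
    """Return True if the permission name is outside the standard prefixes."""
    return not permission.startswith(STANDARD_PREFIXES)


def extract_permissions(manifest_xml: str) -> dict:
    """Split custom permissions into declared (<permission>) and
    requested (<uses-permission>) sets for one decoded manifest."""
    root = ET.fromstring(manifest_xml)
    declared = {e.get(NAME_ATTR) for e in root.iter("permission")}
    requested = {e.get(NAME_ATTR) for e in root.iter("uses-permission")}
    declared.discard(None)   # ignore malformed entries without a name
    requested.discard(None)
    return {
        "package": root.get("package"),
        "declared_custom": sorted(p for p in declared if is_custom(p)),
        "requested_custom": sorted(p for p in requested if is_custom(p)),
    }


if __name__ == "__main__":
    print(extract_permissions(SAMPLE_MANIFEST))
```

In this sketch a declared permission is one that the package itself defines and exposes to other apps, whereas a requested permission is one it asks to be granted; aggregating the declared set per package and per vendor would give counts of the kind reported in the next subsection.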

A. Declared Custom Permissions

We identify 1,795 unique Android package names across 108 Android vendors defining 4,845 custom permissions. We exclude AOSP-defined permissions and those associated with Google’s Cloud Messaging (GCM) [37]. The number of custom permissions declared per Android vendor varies across brands and models due to the actions of other stakeholders in the supply chain. We classify the organizations declaring custom permissions in 8 groups as shown in Table IV: hardware vendors, MNOs (e.g., Verizon), third-party services (e.g., Facebook), AV firms (e.g., Avast), industry alliances (e.g., GSMA), chipset manufacturers (e.g., Qualcomm), and browsers (e.g., Mozilla). We could not confidently identify the organizations responsible for 9% of all the custom permissions (footnote 5).

Footnote 5: While Android’s documentation recommends using reverse-domain-style naming for defining custom permissions to avoid collisions [26], 269 of them – many of which are declared by a single hardware vendor – start with AOSP prefixes such as android.permission.*. The absence of good development practices among developers complicated this classification, forcing us to follow a semi-manual process that involved analyzing multiple signals to identify their possible purpose and for attribution.

Table IV: Summary of custom permissions per provider category and their presence in selected sensitive Android core modules. The value in brackets reports the number of Android vendors in which custom permissions were found.
Providers Custom MNO permissions Ind. Alliance Browser Other Third-party Chipset Vendor AV / Security 195 (15) 3,760 (37) 29 (44) 7 (6) 549 (75) 46 (13) 192 (34) 67 (63) Total 4,845 (108) Android Modules 12 (2) 4 (13) — 6 (7) android 62 (17) 494 (21) 410 (9) — — 1 (2) — — — — 22 (8) 67 (11) 90 (15) — — 1 (1) — — — — 23 (8) 87 (16) 63 (12) — 3 (5) — — — 20 (10) 84 (14) 56 (9) 5 (2) — — — 1 (1) — 22 (8) 35 (10) 1 (2) 59 (11) — — — — — — 8 (5) 40 (7) 32 (3) — — — — — 15 (17) — 33 (10) 18 (4)

As shown in Table IV, 63% of all declared custom permissions are defined by 31 handset vendors according to our classification. Most of them are associated with proprietary services such as Mobile Device Management (MDM) solutions for enterprise customers. Yet three vendors account for over 68% of the total custom permissions; namely Samsung (41%), Huawei (20%), and Sony (formerly Sony-Ericsson, 7%). Most of the custom permissions added by hardware vendors – along with chipset manufacturers, and MNOs – are exposed by Android core services, including the default browser. Unfortunately, as demonstrated in the MediaTek case [79], exposing such sensitive resources in critical services may potentially increase the attack surface if not implemented carefully.

An exhaustive analysis of custom permissions also suggests (and in some cases confirms) the presence of service integration and commercial partnerships between handset vendors, MNOs, analytics services (e.g., Baidu, ironSource, and Digital Turbine), and online services (e.g., Skype, LinkedIn, Spotify, CleanMaster, and Dropbox). We also found custom permissions associated with vulnerable modules (e.g., MediaTek) and potentially harmful services (e.g., Adups). We discuss cases of interest below.

VPN solutions: Android provides native support to third-party VPN clients. This feature is considered highly sensitive as it gives any app requesting access the capacity to break Android’s sandboxing and monitor users’ traffic [68], [80]. The analysis of custom permissions reveals that Samsung and Meizu implement their own VPN service. It is unclear why these proprietary VPN implementations exist, but they have been reported as problematic by VPN developers whose clients, designed for Android’s default VPN service, do not run on such handsets [1], [86], [80]. A complete analysis of these VPN packages is left for future work.

Facebook: We found 6 different Facebook packages, three of them unavailable on Google Play, declaring 18 custom permissions as shown in Table V. These permissions have been found in 24 Android vendors, including Samsung, Asus, Xiaomi, HTC, Sony, and LG. According to users’ complaints, two of these packages (com.facebook.appmanager and com.facebook.system) seem to automatically download other Facebook software such as Instagram in users’ phones [69], [70]. We also found interactions between Facebook and MNOs such as Sprint.

Table V: Facebook packages on pre-installed handsets.
Package | Public | # Vendors | # Permissions
com.facebook.system | No | 18 | 2
com.facebook.appmanager | No | 15 | 4
com.facebook.katana (Facebook) | Yes | 14 | 8
com.facebook.orca (Messenger) | Yes | 5 | 5
com.facebook.lite (FB Lite) | Yes | 1 | 1
(package name missing) | No | 1 | 4
Total | 3 | 24 | 18

Baidu: Baidu’s geo-location permission is exposed by pre-installed apps, including core Android modules, in 7 different vendors, mainly Chinese ones. This permission seems to be associated with Baidu’s geocoding API [19] and could allow app developers to circumvent Android’s location permission.

Digital Turbine: We have identified 8 custom permissions in 8 vendors associated with Digital Turbine and its subsidiary LogiaGroup. Their privacy policy indicates that they collect personal data ranging from UIDs to traffic logs that could be shared with their business partners, which are undisclosed [27]. According to the SIM information of these devices, Digital Turbine modules are mainly found among North American and Asian users. One package name, com.dti.att (“dti” stands for Digital Turbine Ignite), suggests the presence of a partnership with AT&T. A manual analysis confirms that this is the case. An inspection of its source code suggests that this package implements a comprehensive software management service. Installations and removals of apps by users are tracked and linked with PII, which only seems to be “masked” (i.e., hashed) discretionally.

ironSource: The advertising company ironSource exposes custom permissions related to its AURA Enterprise Solutions [44]. We have identified several vendor-specific packages exposing custom ironSource permissions, in devices made by vendors such as Asus, Wiko, and HTC (the package name and certificate signatures suggest that those modules are possibly introduced with the vendor’s collaboration). According to ironSource’s material [45], AURA has access to over 800 million users per month, while gaining access to advanced analytics services and to pre-load software on customers’ devices. A superficial analysis of some of these packages (e.g., com.ironsource.appcloud.oobe.asus) reveals that they provide vendor-specific out-of-the-box-experience (OOBE) apps to customize a given user’s device when the users open their device for the first time and empower user engagement [44], while also monitoring users’ activities.

Other Advertising and Tracking Services: Discussing every custom permission introduced by third-party services individually would require an analysis beyond the scope of this paper. However, there are a couple of anecdotes of interest that we discuss next. One is the case of a pre-installed app signed by Vodafone (Greece) and present in a Samsung device that exposes a custom permission associated with Exus [31], a firm specialized in credit risk management and banking solutions. Another service declaring custom permissions in Samsung and LG handsets (likely sold by Verizon) is the analytics and user engagement company Synchronoss. Its privacy policy acknowledges the collection, processing and sharing of personal data [65].

Call protection services: We identify three external companies providing services for blocking undesired and spam phone calls and text messages: Hiya [38], TrueCaller [57], and PrivacyStar [59]. Hiya’s solution seems to be integrated by T-Mobile (US), Orange (Spain), and AT&T (US) in their subsidized Samsung and LG handsets according to the package signatures. Hiya and TrueCaller’s privacy policies indicate that they collect personal data from end users, including contacts stored in the device, UIDs, and personal information [39] (footnote 6). PrivacyStar’s privacy policy, instead, claims that any information collected from a given user’s contacts is “NOT exported outside the App for any purpose” [60].

Footnote 6: Note: the information rendered in their privacy policy differs when crawled from a machine in the EU or the US. As of January 2019, none of these companies mention the new European GDPR directive in their privacy policies.

Figure 2: Permissions defined by AV firms, MNOs, chipset vendors and third parties, requested by pre-installed apps. [Heatmap: permission providers, grouped into alliances, AV/security, chipset, MNO, and third parties, on the y-axis; handset vendor on the x-axis.]

B. Used Permissions

The use of permissions by pre-installed Android apps follows a power-law distribution: 4,736 of the package names request at least one permission and 55 apps request more than 100. The fact that pre-installed apps request many permissions to deliver their service does not necessarily imply a breach of privacy for the user. However, we identified a significant number of potentially over-privileged vendor- and MNO-specific packages with suspicious activities such as com.jrdcom.Elabel – a package signed by TCLMobile requesting 145 permissions and labeled as malicious by Hybrid Analysis (a free online malware analysis service) – and com.cube26.coolstore (144 permissions). Likewise, the calculator app found on a Xiaomi Mi 4c requests the user’s location and the phone state, which gives it access to UIDs such as the IMEI. We discuss more instances of over-privileged apps in Section V-C.

Dangerous Android permissions. The median pre-installed Android app requests three dangerous AOSP permissions. When we look at the set of permissions requested by a given app (by its package name) across vendors, we can notice significant differences. We investigate such variations in a subset of 150 package names present in at least 20 different vendors. This list contains mainly core Android services as well as apps signed by independent companies (e.g., Adups) and chipset manufacturers (e.g., Qualcomm).

Then, we group together all the permissions requested by a given package name across all device models for each brand. As in the case of exposed custom permissions, we can see a tendency towards over-privileging these modules in specific vendors. For instance, the number of permissions requested by the core android module can range from 9 permissions in a Google-branded Android device to over 100 in most Samsung devices. Likewise, while the median

service requests 35 permissions, this number goes over 100 for Samsung, Huawei, Advan, and LG devices.

Figure 3: Apps accessing vendors’ custom permissions. [Heatmap: developer signature (organization) on the y-axis; handset vendor on the x-axis.]

Figure 4: System permissions requested by pre-installed apps embedding TPLs. [Bar chart: permission usage on the x-axis; system permission on the y-axis; series for advertisement, analytics, and social libraries.]

Custom permissions. 2,910 pre-installed apps request at least one custom permission. The heatmap in Figure 2 shows the number of custom permissions requested by pre-installed packages in a hand-picked set of popular Android manufacturers (x-axis). As we can see, the use of custom permissions also varies across vendors, with those associated with large third-party analytics and tracking services (e.g., Facebook), MNOs (e.g., Vodafone), and AV/Security services (e.g., Hiya) being the most requested ones.

This analysis uncovers possible partnerships beyond those revealed in the previous sections. We identify vendor-signed services accessing ironSource’s, Hiya’s, and AccuWeather’s permissions. This state of affairs potentially allows third-party services and developers to gain access to protected permissions requested by other pre-installed packages signed with the same signature. Further, we found Sprint-signed packages resembling those of Facebook and Facebook’s Messenger APKs (com.facebook.orca.vpl and com.facebook.katana.vpl) requesting Flurry-related permissions (a third-party tracking service owned by Verizon).

Commercial relationships between third-party services and vendors appear to be bi-directional as shown in Figure 3. This figure shows evidence of 87 apps accessing vendor permissions, including packages signed by Facebook, ironSource, Hiya, Digital Turbine, Amazon, Verizon, Spotify, various browsers, and MNOs – grouped by developer signature for clarity purposes. As the heatmap indicates, Samsung, HTC and Sony are the vendors enabling most of the custom permissions requested by over-the-top apps. We found instances of apps listed on the Play Store also requesting such permissions. Unfortunately, custom permissions are not shown to users when shopping for mobile apps in the store – therefore they are apparently requested without consent – allowing them to cause serious damage to users’ privacy when misused by apps.

C. Permission Usage by TPLs

We look at the permissions used by apps embedding at least one TPL. We study the access to permissions with a protection level of either signature or signature|privileged as they can only be granted to system apps [50] or those signed with a system signature. The presence of TPLs in pre-installed apps requesting access to a signature or dangerous permission can, therefore, give it access to very sensitive resources without user awareness and consent. Figure 4 shows the distribution of signature permissions requested across apps embedding TPLs. We find that the most used permissions – READ_LOGS, MOUNT_UNMOUNT_FILESYSTEMS, and INSTALL_PACKAGES – allow the app (and thus the TPLs within it) to read system logs, mount and unmount filesystems, or install packages. We find no significant differences between the three types of TPLs of interest. For completeness, we also find that 94 apps embedding TPLs of interest request custom permissions as well. Interestingly, 53% of the 88 custom permissions used by these apps are defined by Samsung.

D. Component Exposing

Custom permissions are not the only mechanism available for app developers to expose (or access) features and components to (or from) other apps. Android apps can also interact with each other using intents, a high-level communication abstraction [42]. An app may expose its component(s) to external apps by declaring android:exported=true in the manifest without protecting the component with any additional measure, or by adding one or more intent-filters to its declaration in the manifest; exposing it to a type of attack known in the literature as a confused deputy attack [79]. If the exported attribute is used, the component can be protected by adding a permission to it, be it a custom permission or an AOSP one, or by checking the caller app’s permissions programmatically in the component’s Java class.

We sought to identify potentially careless development practices that may lead to components getting exposed without any additional protection. Exporting components can lead to:

(i) harmful or malicious apps launching an exposed activity, tricking users into believing that they are interacting with the benign one; (ii) initiating and binding to unprotected services; and (iii) malicious apps gaining access to sensitive data or the ability to modify the app’s internal state.

We found 6,849 pre-installed apps that potentially expose at least one activity in devices from 166 vendors and signed by 261 developer signatures with exported=true. For services, 4,591 apps (present in 157 vendors) signed by 183 developers including manufacturers, potentially exposed one or more of their services to external apps. The top-10 vendors in our dataset account for over 70% of the potentially exposed activities and services. Other relevant examples include an app that potentially exposes several activities related to system configurations (device administration, networking, etc.), hence allowing a malicious developer to access or even tamper with a user’s device settings. The core package found in customized firmware versions across several vendors could also expose services to read WAP messages to other apps. We also found 8 different instances of a third-party app, found in handsets built by two large Android manufacturers, whose intended purpose is to provide remote technical support to customers. This particular service provides remote administration to MNOs, including the ability to record audio and video, browse files, access system settings, and upload/download files. The key service to do so is exposed and can be misused by other apps.

We leave the detailed study of apps vulnerable to confused deputy attacks and the study of the access to these resources by apps publicly available on Google Play for future work.

V. BEHAVIORAL ANALYSIS

We analyze the apps in our dataset to identify potentially harmful and unwanted behaviors. To do this, we leverage both static and dynamic analysis tools to elicit behavior and characterize purpose and means. This section describes our analysis pipeline and evidence of potentially harmful and privacy-intrusive pre-installed packages.

A. Static Analysis

We triage all apps to determine the presence of potentially harmful behaviors. This step allows us to obtain a high-level overview of behaviors across the dataset and also provides us with the basis to score apps and flag those potentially more interesting. This step is critical since we could only afford to manually inspect a limited subset of all available apps.

Toolkit. Our analysis pipeline integrates various static analysis tools to elicit behavior in Android apps, including Androwarn [12], FlowDroid [74], and Amandroid [92], as well as a number of custom scripts based on the Apktool [13] and Androguard [4] frameworks. In this stage we do not use dynamic analysis tools, which prevents us from identifying hidden behaviors that rely on dynamic code uploading (DEX loading) or reflection. This means that our results present a lower-bound estimation of all the possible potentially harmful behaviors. We search for apps using DEX loading and reflection to identify targets that deserve manual inspection.

Dataset. Because of scalability limitations – our dataset comprises 82,501 APK files with 6,496 unique package names – we randomly select one APK file for each package name and analyze the resulting set of apps, obtaining an analysis report for 48% of them. The majority of the remaining packages could not be analyzed due to the absence of a classes.dex for odexed files. Even though in some cases we had the corresponding .odex file, we generally could not deodex it since the device’s Android framework file was needed to complete this step but Firmware Scanner did not collect it. Moreover, we could not analyze a small subset of apps due to the limitations of our tools, including errors generated during analysis, file size limitations, or analysis tools becoming unresponsive after hours of processing. Instead, we focused our analysis on the subset of apps for which we could generate reports.

Results. We processed the analysis reports and identified the presence of the 36 potentially privacy intrusive behaviors or potentially harmful behaviors listed in Table VI. The results suggest that a significant fraction of the analyzed apps could access and disseminate both user and device identifiers, user’s location, and device current configuration. According to our flow analysis, these results give the impression that personal data collection and dissemination (regardless of the purpose or consent) is not only pervasive but also comes pre-installed. Other a priori concerning behaviors include the possible dissemination of contacts and SMS contents (164 and 74 apps, respectively), sending SMS (29 apps), and making phone calls (339 apps). Even though there are perfectly legitimate use cases for these behaviors, they are also prevalent in harmful and potentially unwanted software. The distribution of the number of potentially harmful behaviors per app follows a power-law distribution. Around 25% of the analyzed apps present at least 5 of these behaviors, with almost 1% of the apps showing 20 or more. The bulk of the distribution relates to the collection of telephony and network identifiers, interaction with the package manager, and logging activities. This provides a glimpse of how pervasive user and device fingerprinting is nowadays.

B. Traffic Analysis

While static analysis can be helpful to determine a lower bound of what an app is capable of, relying on this technique alone gives an incomplete picture of the real-world behavior of an app. This might be due to code paths that are not available at the time of analysis, including those that are within statically- and dynamically-linked libraries that are not provided with apps, behaviors determined by server-side logic (e.g., due to real-time ad-bidding), or code that is loaded at runtime using Java’s reflection APIs. This limitation of static approaches is generally addressed by complementing static analysis with dynamic analysis tools. However, due to various limitations (including missing hardware features and software components) it was unfeasible for us to run all the pre-installed apps in our dataset in an analysis sandbox. Instead, we decided to use the crowd-sourced Lumen mobile traffic dataset to find evidence of dissemination of personal data from the pre-installed apps by examining packages that exist in both datasets.
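The cross-dataset step described above, which keeps only traffic generated by packages that also appear in the firmware dataset, can be sketched as a simple join on the package name. This is an illustrative sketch rather than the authors’ tooling: the record layout (`package`, `fqdn`), the package names, and the domains are hypothetical and do not reflect the actual schema of the Lumen dataset.

```python
from collections import defaultdict

# Hypothetical package names observed in firmware (pre-installed apps).
preinstalled = {"com.example.weather", "com.example.fota", "com.example.dialer"}

# Hypothetical traffic flow records; field names are assumptions.
flows = [
    {"package": "com.example.weather", "fqdn": "api.tracker.example"},
    {"package": "com.example.weather", "fqdn": "cdn.weather.example"},
    {"package": "com.example.game", "fqdn": "ads.example"},
    {"package": "com.example.fota", "fqdn": "update.fota.example"},
]


def domains_per_preinstalled_app(preinstalled, flows):
    """Map each pre-installed package with at least one flow to its FQDNs."""
    contacted = defaultdict(set)
    for flow in flows:
        # Keep only flows whose package also exists in the firmware dataset.
        if flow["package"] in preinstalled:
            contacted[flow["package"]].add(flow["fqdn"])
    return dict(contacted)


result = domains_per_preinstalled_app(preinstalled, flows)
for package, fqdns in sorted(result.items()):
    print(package, sorted(fqdns))
```

Packages without any recorded flow (here the hypothetical `com.example.dialer`) simply do not appear in the result, so the output covers only apps for which traffic evidence exists.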

Table VI: Volume of apps accessing / reading PII or showing potentially harmful behaviors. The percentage is referred to the subset of triaged packages (N = 3,154).
Category | Accessed PII type / behaviors | Apps (#) | Apps (%)
Telephony identifiers | IMEI | 687 | 21.8
Telephony identifiers | IMSI | 379 | 12
Telephony identifiers | Phone number | 303 | 9.6
Telephony identifiers | MCC | 552 | 17.5
Telephony identifiers | MNC | 552 | 17.5
Telephony identifiers | Operator name | 315 | 10
Telephony identifiers | SIM Serial number | 181 | 5.7
Telephony identifiers | SIM State | 383 | 12.1
Telephony identifiers | Current country | 194 | 6.2
Telephony identifiers | SIM country | 196 | 6.2
Telephony identifiers | Voicemail number | 29 | 0.9
Telephony identifiers | Software version | 25 | 0.8
Device settings | Phone state | 265 | 8.4
Device settings | Installed apps | 1,286 | 40.8
Device settings | Phone type | 375 | 11.9
Device settings | Logs | 2,568 | 81.4
Location | GPS | 54 | 1.7
Location | Cell location | 158 | 5
Location | CID | 162 | 5.1
Location | LAC | 137 | 4.3
Network interfaces | Wi-Fi configuration | 9 | 0.3
Network interfaces | Current network | 1,373 | 43.5
Network interfaces | Data plan | 699 | 22.2
Network interfaces | Connection state | 71 | 2.3
Network interfaces | Network type | 345 | 10.9
Personal data | Contacts | 164 | 11
Personal data | SMS | 73 | 2.31
Phone service abuse | SMS sending | 29 | 0.92
Phone service abuse | SMS interception | 0 | 0
Phone service abuse | Disabling SMS notif. | 0 | 0
Phone service abuse | Phone calls | 339 | 10.7
Audio/video interception | Audio recording | 74 | 2.4
Audio/video interception | Video capture | 21 | 0.7
Arbitrary code execution | Native code | 775 | 24.6
Arbitrary code execution | Linux commands | 563 | 17.9
Remote conn. | Remote connection | 89 | 2.8

Table VII: Top 15 parent ATS organizations by number of apps connecting to all their associated domains.
Organization | # of apps | # of domains
Alphabet | 566 | 17052
Facebook | 322 | 3325
Amazon | 201 | 991
Verizon Communications | 171 | 320
Twitter | 137 | 101
Microsoft | 136 | 408
Adobe | 116 | 302
AppsFlyer | 98 | 10
comScore | 86 | 8
AccuWeather | 86 | 15
MoatInc. | 79 | 20
Appnexus | 79 | 35
Baidu | 72 | 69
Criteo | 70 | 62
PerfectPrivacy | 68 | 28
Other ATS | 221 | 362

Results. Of the 3,118 pre-installed apps with Internet access permissions, 1,055 have at least one flow in the Lumen dataset. At this point, our analysis of these apps focused on two main aspects: uncovering the ecosystem of organizations who own the domains that these apps connect to, and analyzing the types of private information they could disseminate from user devices. To understand the ecosystem of data collection by pre-installed apps, we studied where the data that is collected by these apps makes its first stop. We use the Fully-Qualified Domain Names (FQDN) of the servers that are contacted and use the web crawling and text mining techniques described in our previous work [85] to determine the parent organization that owns these domains.

The Big Players. Table VII shows the parent organizations who own the most popular domains contacted by pre-installed apps in the Lumen dataset. Of the 54,614 domains contacted by apps, 7,629 belong to well-known Advertising and Tracking Services (ATS) [85]. These services are represented by organizations like Alphabet, Facebook, Verizon (now owner of Yahoo!, AOL, and Flurry), Twitter (MoPub’s parent organization), AppsFlyer, comScore, and others. As expected, Alphabet, the entity that owns and maintains the Android platform and many of the largest advertising and tracking services (ATS) in the mobile ecosystem [85], also owns most of the domains to which pre-installed apps connect. Moreover, vendors who ship their devices with the Google Play Store have to go through Google’s certification program which, in part, entails pre-loading Google’s services. Among these services is Google’s own package, which sends a variety of information about the user and the device on which it runs to Google’s servers.

Traffic analysis also confirms that Facebook and Twitter services come pre-installed on many phones and are integrated into various apps. Many devices also pre-install weather apps like AccuWeather and The Weather Channel. As reported by previous research efforts, these weather providers also gather information about the devices and their users [87], [85].

C. Manual Analysis: Relevant Cases

We used the output provided by our static and dynamic analysis pipeline to score apps and thus flag a reduced subset of packages to inspect manually. Our goal here was to confidently identify potentially harmful and unwanted behavior in pre-installed apps. Other apps were added to this set based on the results of our third-party library and permission analysis performed in Sections III and IV, respectively. We manually analyzed 158 apps using standard tools that include DEX disassemblers (baksmali), dex-to-java decompilers (jadx, dex2jar), resource analysis tools (Apktool), instrumentation tools (Frida), and reverse engineering frameworks (radare2 and IDA Pro) for native code analysis. Our main findings can be loosely grouped into three large categories: 1) known malware; 2) potential personal data access and dissemination; and 3) potentially harmful apps. Table VIII provides some examples of the type of behaviors that we found.

Known Malware. We came across various isolated instances of known malware in the system partition, mostly in low-end devices but also in some high-end phones. We identified variants of well-known Android malware families that have been prevalent in the last few years, including Triada, Rootnik, SnowFox, Xinyin, Ztorg, Iop, and dubious software developed by GMobi. We used VirusTotal to label these samples. According to existing AV reports, the range of behaviors that such

samples exhibit encompasses banking fraud, sending SMS to premium numbers or subscribing to services, silently installing additional apps, visiting links, and showing ads, among others. While our method does not allow us to distinguish whether potentially malicious apps are indeed pre-installed or took advantage of system vulnerabilities to install themselves in the system partition, it is important to highlight that the presence of pre-installed malware in Android devices has been previously reported by various sources [66], [6], [67]. Some of the found samples use Command and Control (C2) servers still in operation at the time of this writing.

Table VIII: Examples of relevant cases and their potential behaviors found after manual analysis of a subset of apps. When referring to personal data dissemination, the term PII encompasses items enumerated in Table VI.
Family | Potential Behavior and Prevalence
Known Malware
Triada | Disseminates PII and other sensitive data (SMS, call logs, contact data, stored pictures and videos). Downloads additional stages. Roots the device to install additional apps.
Rootnik [62] | Gains root access to the device. Leaks PII and installs additional apps. Uses anti-analysis and anti-debugging techniques.
GMobi [11], [67] | Gmobi Trade Service. Leaks PII, including device serial number and MAC address, geolocation, installed packages and emails. Receives commands from servers to (1) send an SMS to a given number; (2) download and install apps; (3) visit a link; or (4) display a pop-up. It has been identified in low-end devices.
Potentially Dangerous Apps
Rooting app | Exposes an unprotected receiver that roots the device upon receiving a telephony secret code (via intent or dialing *#*#9527#*#*).
Blocker | If the device does not contain a signed file in a particular location, it loads and enforces 2 blacklists: one containing 103 packages associated with benchmarking apps, and another with 56 web domains related to phone reviews.
Potential Personal Data Access and Dissemination
TrueCaller | Sends PII to its own servers and embedded third-party ATSes such as AppsFlyer, Twitter-owned MoPub, Crashlytics, inMobi, Facebook, and others. Uploads phone call data to at least one of its own domains.
MetroName ID | Disseminates PII to its own servers and also to third-party services like Piano, a media audience and engagement analytics service that tracks user’s installation of news apps and other partners including those made by CNBC, Bloomberg, TechCrunch, and The Economist, among others, the presence of which it reports to its own domains.
Adups [47] | FOTA app. Collects and shares private and PII with their own servers and those of embedded third-party ATS domains, including Advmob and Nexage. Found worldwide in 55 brands.
Stats/Meteor | Redstone’s FOTA service. Uses dynamic code uploading and reflection to deploy components located in 2 encrypted DEX files. Disseminates around 50 data items that fully characterize the hardware, the telephony service, the network, geolocation, and installed packages. Performs behavioral and performance profiling, including counts of SMS/MMS, calls logs, bytes sent and transmitted, and usage stats and performance counters on a package-basis. Silently installs packages on the device and reports what packages are installed / removed by the user.

Personal Data Access and Potential Dissemination. Nearly all apps which we identified as able to access PII appear to disseminate it to third-party servers. We also observed instances of apps with capabilities to perform hardware and network fingerprinting, often collected under the term “device capability,” and even analytics services that track the installation and removal of apps (notably news apps, such as those made by CNBC, Bloomberg, TechCrunch, and The Economist, among others). More intrusive behaviors include apps able to collect and send email and phone call metadata. The most extreme case we analyzed is a data collection service contained in a FOTA service associated with Redstone Sunshine Technology Co., Ltd. [61], an OTA provider that “supports 550 million phone users and IoT partners in 40 countries” [22]. This app includes a service that can collect and disseminate dozens of data items, including both user and device identifiers, behavioral information (counts of SMS and calls sent and received, and statistics about network flows) and usage statistics and performance information per installed package. Overall, this software seems to implement an analytics program that admits several monetization strategies, from optimized ad targeting to providing performance feedback to both developers and manufacturers. We emphasize that the data collected is not only remarkably extensive and multi-dimensional, but also very far away from being anonymous as it is linked to both user and device IDs.

Potentially dangerous apps. We found 612 pre-installed apps that potentially implement engineering- or factory-mode functions according to their package and app names. Such functions include relatively harmless tasks, such as hardware tests, but also potentially dangerous functions such as the ability to root the device. We found instances of such apps in which the rooting function was unprotected in their manifest (i.e., the component was available for every other app to use). We also identified well-known vulnerable engineering mode apps such as MTKLogger [82]. Such apps expose unprotected components that can be misused by other apps co-located in the device. Other examples include a well-known manufacturer’s service, which under certain conditions blacklists connections to a pre-defined list of 56 web domains (mobile device review and benchmarking websites, mostly) and disables any installed package that matches one of a list of 103 benchmarking apps.

VI. STUDY LIMITATIONS

Completeness and coverage. Our dataset is not complete in terms of Android vendors and models, even though we cover those with a larger market share, both in the high- and low-end parts of the spectrum. Our data collection process is also best-effort. The lack of background knowledge and documentation required performing a detailed case-by-case study and a significant amount of manual inspection. In terms of analyzed apps, determining the coverage of our study is difficult since we do not know the total number of pre-installed apps in all shipped handsets.

Attribution. There is currently no reliable way to accurately find the legitimate developer of a given pre-installed app by its

self-signed signature. We have found instances of certificates with just a country code in the Issuer field, and others with strings suggesting that major vendors (e.g., Google) signed the app, where the apps certainly were not signed by them. The same applies to package and permission names, many of which are opaque and do not follow naming best practices. Likewise, the lack of documentation regarding custom permissions prevented us from automating our analysis. Moreover, a deeper study of this issue would require checking whether those permissions are granted at runtime, tracing the code to fully identify their purpose, and finding whether they are actually used by other apps in the wild, and at scale.

Package Manager. We do not collect the packages.xml file from our users' devices as it contains information about all installed packages, and not just pre-installed ones. We consider that collecting this file would be invasive. This, however, limits our ability to see if user-installed apps are using services exposed by pre-installed apps via intents or custom permissions. We tried to compensate for that with a manual search for public apps that use pre-installed custom permissions, as discussed in Section IV-D.

Behavioral coverage. Our study mainly relies on static analysis of the samples harvested through Firmware Scanner, and we only applied dynamic analysis to a selected subset of 1,055 packages. This prevents us from eliciting behaviors that are only available at runtime because of the use of code loading and reflection, and also code downloading from third-party servers. Despite this, our analysis pipeline served to identify a considerable number of potentially harmful behaviors. A deeper and broader analysis would possibly uncover more cases.

Identifying rooted devices. There is no sure way of knowing whether a device is rooted or not. While our conservative approach limits the number of false negatives, we have found occurrences of devices with well-known custom ROMs that were not flagged as rooted by RootBeer. Moreover, we have found some apps that allow a third party to root the device on-the-fly to, for example, install new apps on the system partition as discussed in Section V-C. Some of these apps can then un-root the phone to avoid detection. When such an app is present on a device, we cannot know for sure if a given package (particularly a potentially malicious app) was pre-installed by an actor in the supply chain, or was installed afterwards.

VII. RELATED WORK

Android images customization. Previous work has focused on studying modifications made to AOSP images, whether by adding root certificates [89], customizing the default apps [73], or the OS itself [95]. In [72], Aafer et al. introduced a new class of vulnerability caused by the firmware customization process. If an app is removed but a reference to it remains in the OS, a malicious app could potentially impersonate it, which could lead to privacy and security issues. While these studies have focused on Android images as a whole rather than pre-installed apps, they all show the complexity of the Android ecosystem and underline the lack of control over the supply chain.

Android permissions. Previous studies on Android permissions have mainly leveraged static analysis techniques to infer the role of a given permission [75], [78]. These studies, however, do not cover newer versions of Android [94], or custom permissions. In [81], Jiang et al. demonstrated how custom permissions are used to expose and protect services. Our work complements this study by showing how device makers and third parties alike declare and use custom permissions, and makes the first step towards a complete and in-depth analysis of the whole custom permissions landscape.

Vulnerabilities in pre-installed apps. A recent paper by Wu et al. [93] also used crowdsourcing mechanisms to detect apps that listen to a given TCP or UDP port and analyze the vulnerabilities that are caused by this practice. While their study is not limited to user-installed apps, they show evidence of pre-installed apps exhibiting this behavior.

VIII. DISCUSSION AND CONCLUSIONS

This paper studied, at scale, the vast and unexplored ecosystem of pre-installed Android software and its potential impact on consumers. This study has made clear that, thanks in large part to the open-source nature of the Android platform and the complexity of its supply chain, organizations of various kinds and sizes have the ability to embed their software in custom Android firmware versions. As we demonstrated in this paper, this situation has become a peril to users' privacy and even security due to an abuse of privilege or as a result of poor software engineering practices that introduce vulnerabilities and dangerous backdoors.

The Supply Chain. The myriad of actors involved in the development of pre-installed software and the supply chain range from hardware manufacturers to MNOs and third-party advertising and tracking services. These actors have privileged access to system resources through their presence in pre-installed apps but also as third-party libraries embedded in them. Potential partnerships and deals, made behind closed doors between stakeholders, may have made user data a commodity before users purchase their devices or decide to install software of their own.

Attribution. Unfortunately, due to a lack of central authority or trust system to allow verification and attribution of the self-signed certificates that are used to sign apps, and due to a lack of any mechanism to identify the purpose and legitimacy of many of these apps and custom permissions, it is difficult to attribute unwanted and harmful app behaviors to the party or parties responsible. This has broader negative implications for accountability and liability in this ecosystem as a whole.

The Role of Users and Informed Consent. In the meantime, regular Android users are, by and large, unaware of the presence of most of the software that comes pre-installed on their Android devices and of its associated privacy risks. Users are clueless about the various data-sharing relationships and partnerships that exist between companies that have a hand in deciding what comes pre-installed on their phones. Users' activities, personal data, and habits may be constantly monitored by stakeholders that many users may have never heard of, let alone consented to collect their data. We have demonstrated instances of devices being backdoored by companies with

the ability to root and remotely control devices without user awareness, and install apps through targeted monetization and user-acquisition campaigns. Even if users decide to stop or delete some of these apps, they will not be able to do so since many of them are core Android services and others cannot be permanently removed by the user without root privileges. It is unclear if the users have actually consented to these practices, or if they were informed about them before using the devices (i.e., on first boot) in the first place. To clarify this, we acquired 6 popular brand-new Android devices from vendors including Nokia, Sony, LG, and Huawei from a large Spanish retailer. When booting them, 3 devices did not present a privacy policy at all, only the Android terms of service. The rest rendered a privacy policy that only mentions that they collect data about the user, including PII such as the IMEI, for added-value services. Note that users have no choice but to accept Android's terms of service, as well as the manufacturer's if presented to the user. Otherwise Android will simply stop booting, which will effectively make the device unusable.

Consumer Protection Regulations. While some jurisdictions have very few regulations governing online tracking and data collection, there have been a number of movements to regulate and control these practices, such as the GDPR in the EU [29], and California's CCPA [21] in the US. While these efforts are certainly helpful in regulating the rampant invasion of users' privacy in the mobile world, they have a long way to go. Most mobile devices still lack a clear and meaningful mechanism to obtain informed consent, which is a potential violation of the GDPR. In fact, it is possible that many of the ATSes that come pre-installed on Android devices may not be compliant with COPPA [88] (a US federal rule to protect minors from unlawful online tracking [23]), despite the fact that many minors in the US use mobile devices with pre-installed software that engages in data collection. This indicates that even in jurisdictions with strict privacy and consumer protection laws, there still remains a large gap between what is done in practice and the enforcement capabilities of the agencies appointed to uphold the law.

Recommendations. To address the issues mentioned above and to make the ecosystem more transparent, we propose a number of recommendations, which are made under the assumption that stakeholders are willing to self-regulate and to enhance the status quo. We are aware that some of these suggestions may inevitably not align with the corporate interests of every organization in the supply chain, and that an independent third party may be needed to audit the process. Google might be a prime candidate for it given its capacity for licensing vendors and its certification programs. Alternatively, in the absence of self-regulation, governments and regulatory bodies could step in and enact regulations and execute enforcement actions that wrest back some of the control from the various actors in the supply chain. We also propose a number of actions that would help independent investigators to detect deceptive and potentially harmful behaviors.

• Attribution and accountability: To combat the difficulty in attribution and the resulting lack of accountability, we propose the introduction and use of certificates that are signed by globally-trusted certificate authorities. Alternatively, it may be possible to build a certificate transparency repository dedicated to providing details and attribution for self-signed certificates used to sign various Android apps, including pre-installed ones.

• Accessible documentation and consent forms: Similar to the manner in which open-source components of Android require any modified version of the code to be made publicly available, Android devices can be required to document the specific set of apps that come pre-installed, along with their purpose and the entity responsible for each piece of software, in a manner that is accessible and understandable to the users. This will ensure that at least a reference point exists for users (and regulators) to find accurate information about pre-installed apps and their practices. Moreover, the results of our small-scale survey of consent forms of some Android vendors leave a lot to be desired from a transparency perspective: users are not clearly informed about third-party software that is installed on their devices, including embedded third-party tracking and advertising services, the types of data they collect from them by default, and the partnerships that allow personal data to be shared over the Internet. This necessitates a new form of privacy policy suitable for pre-installed apps to be defined (and enforced) to ensure that such practices are at least communicated to the user in a clear and accessible way. This should be accompanied by mechanisms to enable users to make informed decisions about how or whether to use such devices without having to root them.

Final Remarks. Despite a full year of efforts, we were only able to scratch the surface of a much larger problem. This work is therefore exploratory, and we hope it will bring more attention to the Android supply chain ecosystem and its impact on users' privacy and security. We have discussed our results with Google, which gave us useful feedback. Our work was also the basis of a report produced by the Spanish Data Protection Agency (AEPD) [3]. We will also improve the capabilities and features of both Firmware Scanner and Lumen to address some of the aforementioned limitations and develop methods to perform dynamic analysis of pre-installed software. Given the scale of the ecosystem and the need for manual inspections, we will gradually make our dataset (which keeps growing at the time of this writing) available to the research community and regulators to aid in future investigations and to encourage more research in this area.

ACKNOWLEDGMENTS

We are deeply grateful to our Firmware Scanner users for enabling this study, and ElevenPaths for their initial support in this project. We thank the anonymous reviewers for their helpful feedback. This project is partially funded by the US National Science Foundation (grant CNS-1564329), the European Union's Horizon 2020 Innovation Action program (grant Agreement No. 786741, SMOOTH Project), the Spanish Ministry of Science, Innovation and Universities (grants DiscoEdge TIN2017-88749-R and SMOG-DEV TIN2016-79095-C2-2-R), and the Comunidad de Madrid (grant EdgeData-CM P2018/TCS-4499). Any opinions, findings, conclusions, or recommendations expressed in this paper are those of the authors and do not reflect the views of the funding bodies.

REFERENCES

[1] AdGuard - Meizu Incompatibilities. AdguardForAndroid/issues/800. [Online; accessed 31-March-2019].
[2] Amazon suspends sales of Blu phones for including preloaded spyware, again. blu-suspended-android-spyware-user-data-theft. [Online; accessed 31-March-2019].
[3] Análisis del software preinstalado en dispositivos Android y riesgos para la privacidad de los usuarios. html. [Online; accessed 31-March-2019].
[4] Androguard. [Online; accessed 31-March-2019].
[5] Android - Certified. [Online; accessed 31-March-2019].
[6] Android Adware and Ransomware Found Preinstalled on High-End Smartphones. android-adware-and-ransomware-found-preinstalled-on-high-end-smartphones/. [Online; accessed 31-March-2019].
[7] Android Certified Partners. [Online; accessed 31-March-2019].
[8] Android Compatibility Program Overview. compatibility/overview. [Online; accessed 31-March-2019].
[9] Android Developer Documentation. [Online; accessed 31-March-2019].
[10] Android Trackers. trackers/. [Online; accessed 31-March-2019].
[11] Android.Gmobi.1. is=1&i=7999623&lng=en. [Online; accessed 31-March-2019].
[12] Androwarn - Yet another static code analyzer for malicious Android applications. [Online; accessed 31-March-2019].
[13] Apktool - A tool for reverse engineering Android apk files. https:// [Online; accessed 31-March-2019].
[14] App Traps: How Cheap Smartphones Siphon User Data in Developing Countries. smartphones-help-themselves-to-user-data-1530788404. [Online; accessed 31-March-2019].
[15] Application signing. signing. [Online; accessed 31-March-2019].
[16] Appsee - Features. [Online; accessed 31-March-2019].
[17] Appsee Mobile App Analytics. [Online; accessed 31-March-2019].
[18] Asurion. [Online; accessed 31-March-2019].
[19] Baidu Geocoding API. htm. [Online; accessed 31-March-2019].
[20] Baidu SDK. [Online; accessed 31-March-2019].
[21] California Consumer Privacy Act. faces/billTextClient.xhtml?bill_id=201720180AB375. [Online; accessed 31-March-2019].
[22] China Mobile Network Partner Redstone Moves into Robotics. https:// [Online; accessed 31-March-2019].
[23] COPPA - Children's Online Privacy Protection Act. [Online; accessed 31-March-2019].
[24] CVE-2017-2709. CVE-2017-2709. [Online; accessed 31-March-2019].
[25] CVE-2017-2709. 2015-0864. [Online; accessed 31-March-2019].
[26] Define a Custom Permission. topics/permissions/defining. [Online; accessed 31-March-2019].
[27] Digital Turbine - Privacy Policy. privacy-policy/. [Online; accessed 31-March-2019].
[28] Estimote - indoor location with bluetooth beacons and mesh. https:// [Online; accessed 31-March-2019].
[29] EU General Data Protection Regulation (GDPR). [Online; accessed 31-March-2019].
[30] Europe should be wary of Huawei, EU tech official says. https:// [Online; accessed 31-March-2019].
[31] EXUS. [Online; accessed 31-March-2019].
[32] Facebook Gave Device Makers Deep Access to Data on Users and Friends. facebook-device-partners-users-friends-data.html. [Online; accessed 31-March-2019].
[33] Facebook's Data Deals Are Under Criminal Investigation. investigation.html. [Online; accessed 31-March-2019].
[34] Firmware Scanner. imdea.networks.iag.preinstalleduploader. [Online; accessed 31-March-2019].
[35] Global market share held by leading smartphone vendors. by-smartphone-vendors-since-4th-quarter-2009/. [Online; accessed 31-March-2019].
[36] GMobi - General Mobile Corporation. en/. [Online; accessed 31-March-2019].
[37] Google Cloud Messaging. messaging/android/android-migrate-fcm. [Online; accessed 31-March-2019].
[38] Hiya. [Online; accessed 31-March-2019].
[39] Hiya Partners. [Online; accessed 31-March-2019].
[40] How does Truecaller get its data? us/articles/212638485-How-does-Truecaller-get-its-data. [Online; accessed 31-March-2019].
[41] Infinum Inc. [Online; accessed 31-March-2019].
[42] Intents and Intent Filters - Android Developers. https://developer. [Online; accessed 31-March-2019].
[43] IronSource - App monetization done right. [Online; accessed 31-March-2019].
[44] IronSource - AURA. [Online; accessed 31-March-2019].
[45] IronSource - Aura for Advertisers. ironSource/aura-for-advertisers. [Online; accessed 31-March-2019].
[46] Kryptowire Discovers Mobile Phone Firmware that Transmitted Personally Identifiable Information (PII) without User Consent or Disclosure. security_analysis.html. [Online; accessed 31-March-2019].
[47] Kryptowire Provides Technical Details on Black Hat 2017 Presentation: Observed ADUPS Data Collection & Data Transmission. https://www. adups_data_collection_behavior.html. [Online; accessed 31-March-2019].
[48] locationlabs by Avast. [Online; accessed 31-March-2019].
[49] Lumen Privacy Monitor. edu.berkeley.icsi.haystack. [Online; accessed 31-March-2019].
[50] Manifest permissions. Manifest.permission. [Online; accessed 31-March-2019].
[51] Monetize, advertise and analyze Android apps. https://www.appbrain.com. [Online; accessed 31-March-2019].
[52] OnePlus Device Root Exploit: Backdoor in EngineerMode App for Diagnostics Mode. device-root-exploit-backdoor-engineermode-app-diagnostics-mode/. [Online; accessed 31-March-2019].
[53] OnePlus left a backdoor in its devices capable of root access. devices-capable-root-access/. [Online; accessed 31-March-2019].
[54] OnePlus OxygenOS built-in analytics. post/oneplus-analytics/. [Online; accessed 31-March-2019].
[55] OnePlus Secret Backdoor. oneplus_backdoor/. [Online; accessed 31-March-2019].
[56] Permissions overview. permissions/overview.html. [Online; accessed 31-March-2019].
[57] Phone Number Search - TrueCaller. [Online; accessed 31-March-2019].
[58] Privacy Grade. [Online; accessed 31-March-2019].
[59] PrivacyStar. [Online; accessed 31-March-2019].
[60] PrivacyStar Privacy Policy. [Online; accessed 31-March-2019].
[61] Redstone. [Online; accessed 31-March-2019].
[62] Rootnik Android Trojan Abuses Commercial Rooting Tool and Steals Private Information. android-trojan-abuses-commercial-rooting-tool-and-steals-private-information/. [Online; accessed 31-March-2019].
[63] Simple to use root checking Android library. rootbeer. [Online; accessed 31-March-2019].

[64] Smaato Blog. about-location-based-mobile-advertising. [Online; accessed 31-March-2019].
[65] Synchronoss Technologies - Privacy Policy. privacy-policy/#datacollected. [Online; accessed 31-March-2019].
[66] Triada Trojan Found in Firmware of Low-Cost Android Smartphones. ransomware-found-preinstalled-on-high-end-smartphones/. [Online; accessed 31-March-2019].
[67] Upstream - Low-end Android smartphones sold with pre-installed malicious software in emerging markets. https://www.upstreamsystems.com/pre-installed-malware-android-smartphones/. [Online; accessed 31-March-2019].
[68] VPN Service. VpnService. [Online; accessed 31-March-2019].
[69] What is "com,facebook,app manager" and why is it trying to download Instagram, Facebook, and Messenger. https://forums.androidcentral.com/android-apps/547447-what-com-facebook-app-manager-why-trying-download-instagram-facebook-messenge.html. [Online; accessed 31-March-2019].
[70] XDA-Developers Forum (Galaxy Note 4). com.facebook.appmanager. appmanager-t2919151. [Online; accessed 31-March-2019].
[71] Your Data Is Our Data: A Truecaller Breakdown. 2018/05/02/your-data-is-our-data-a-truecaller-breakdown/. [Online; accessed 31-March-2019].
[72] Aafer, Y., Zhang, N., Zhang, Z., Zhang, X., Chen, K., Wang, X., Zhou, X., Du, W., and Grace, M. Hare Hunting In The Wild Android: A Study On The Threat Of Hanging Attribute References. In Proceedings of the ACM Conference on Computer and Communication Security (CCS) (2015).
[73] Aafer, Y., Zhang, X., and Du, W. Harvesting Inconsistent Security Configurations In Custom Android ROMs Via Differential Analysis. In Proceedings of the USENIX Security Symposium (2016).
[74] Arzt, S., Rasthofer, S., Fritz, C., Bodden, E., Bartel, A., Klein, J., Le Traon, Y., Octeau, D., and McDaniel, P. Flowdroid: Precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for android apps. Proceedings of the ACM Special Interest Group on Programming Languages (SIGPLAN) (2014).
[75] Au, K. W. Y., Zhou, Y. F., Huang, Z., and Lie, D. PScout: Analyzing The Android Permission Specification. In Proceedings of the ACM Conference on Computer and Communication Security (CCS) (2012).
[76] Dittrich, D., and Kenneally, E. The Menlo Report: Ethical principles guiding information and communication technology research. US Department of Homeland Security (2012).
[77] Dr. Web. Trojan preinstalled on Android devices infects applications' processes and downloads malicious modules. news/?i=11390&lng=en. [Online; accessed 31-March-2019].
[78] Felt, A. P., Chin, E., Hanna, S., Song, D., and Wagner, D. Android Permissions Demystified. In Proceedings of the ACM Conference on Computer and Communication Security (CCS) (2011).
[79] Felt, A. P., Wang, H. J., Moshchuk, A., Hanna, S., and Chin, E. Permission Re-Delegation: Attacks And Defenses. In Proceedings of the USENIX Security Symposium (2011).
[80] Ikram, M., Vallina-Rodriguez, N., Seneviratne, S., Kaafar, M. A., and Paxson, V. An analysis of the privacy and security risks of android vpn permission-enabled apps. In Proceedings of the Internet Measurement Conference (IMC) (2016).
[81] Jiang, Y. Z. X., and Xuxian, Z. Detecting Passive Content Leaks And Pollution In Android Applications. In Proceedings of the Network and Distributed System Security Symposium (NDSS) (2013).
[82] Johnson, R., Stavrou, A., and Benameur, A. All Your SMS & Contacts Belong to ADUPS & Others. All-Your-SMS-&-Contacts-Belong-To-Adups-&-Others.pdf. [Online; accessed 31-March-2019].
[83] Li, L., Bissyandé, T. F., Klein, J., and Le Traon, Y. An investigation into the use of common libraries in android apps. In Proceedings of the International Conference on Software Analysis, Evolution, and Reengineering (SANER) (2016).
[84] Pan, E., Ren, J., Lindorfer, M., Wilson, C., and Choffnes, D. Panoptispy: Characterizing Audio and Video Exfiltration from Android Applications. In Proceedings of the Privacy Enhancing Technologies Symposium (PETS) (2018).
[85] Razaghpanah, A., Nithyanand, R., Vallina-Rodriguez, N., Sundaresan, S., Allman, M., Kreibich, C., and Gill, P. Apps, Trackers, Privacy, and Regulators: A Global Study of the Mobile Tracking Ecosystem. In Proceedings of the Network and Distributed System Security Symposium (NDSS) (2018).
[86] Razaghpanah, A., Vallina-Rodriguez, N., Sundaresan, S., Kreibich, C., Gill, P., Allman, M., and Paxson, V. Haystack: In situ mobile traffic analysis in user space. arXiv preprint arXiv:1510.01419 (2015).
[87] Ren, J., Lindorfer, M., Dubois, D. J., Rao, A., Choffnes, D., and Vallina-Rodriguez, N. Bug Fixes, Improvements,... and Privacy Leaks. In Proceedings of the Network and Distributed System Security Symposium (NDSS) (2018).
[88] Reyes, I., Wijesekera, P., Reardon, J., On, A. E. B., Razaghpanah, A., Vallina-Rodriguez, N., and Egelman, S. "Won't Somebody Think of the Children?" Examining COPPA Compliance at Scale. In Proceedings of the Privacy Enhancing Technologies Symposium (PETS) (2018).
[89] Vallina-Rodriguez, N., Amann, J., Kreibich, C., Weaver, N., and Paxson, V. A Tangled Mass: The Android Root Certificate Stores. In Proceedings of the International Conference on Emerging Networking EXperiments and Technologies (CoNEXT) (2014).
[90] Vallina-Rodriguez, N., Shah, J., Finamore, A., Grunenberger, Y., Papagiannaki, K., Haddadi, H., and Crowcroft, J. Breaking for commercials: characterizing mobile advertising. In Proceedings of the Internet Measurement Conference (IMC) (2012).
[91] Wang, H., Liu, Z., Liang, J., Vallina-Rodriguez, N., Guo, Y., Li, L., Tapiador, J., Cao, J., and Xu, G. Beyond Google Play: A Large-Scale Comparative Study of Chinese Android App Markets. In Proceedings of the Internet Measurement Conference (IMC) (2018).
[92] Wei, F., Roy, S., Ou, X., and Robby. Amandroid: A Precise and General Inter-component Data Flow Analysis Framework for Security Vetting of Android Apps. In Proceedings of the ACM Conference on Computer and Communication Security (CCS) (2014).
[93] Wu, D., Gao, D., Chang, R. K. C., He, E., Cheng, E. K. T., and Deng, R. H. Understanding Open Ports In Android Applications: Discovery, Diagnosis, And Security Assessment. In Proceedings of the Network and Distributed System Security Symposium (NDSS) (2019).
[94] Zhauniarovich, Y., and Gadyatskaya, O. Small Changes, Big Changes: An Updated View On The Android Permission System. In Research in Attacks, Intrusions, and Defenses (2016).
[95] Zhou, X., Lee, Y., Zhang, N., Naveed, M., and Wang, X. The Peril Of Fragmentation: Security Hazards In Android Device Driver Customizations. In IEEE Symposium on Security and Privacy (SP) (2014).

APPENDIX

A. Userbase distribution

Table IX describes our userbase geographical distribution.

Country | Samples share (N=130) | Vendors (total) | Vendors (unique) | Vendor's share
USA | 12% | 36 | 11 | 17%
Spain | 6% | 24 | 3 | 11%
Indonesia | 6% | 26 | 7 | 12%
Italy | 5% | 15 | 6 | 7%
UK | 4% | 19 | 6 | 9%
Mexico | 3% | 17 | 3 | 8%
Thailand | 3% | 28 | 12 | 13%
Germany | 3% | 21 | 2 | 10%
Belgium | 2% | 17 | 4 | 8%
Netherlands | 2% | 16 | 2 | 8%
Total countries: 130. Total vendors: 214.

Table IX: Geographical distribution of our users. Only the top 10 countries are shown.
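The "Vendor's share" column of Table IX can be cross-checked with simple arithmetic. The sketch below is illustrative and rests on an assumption not stated in the table: that the share is each country's total vendor count divided by the 214 in the table's total row. The names `TABLE_IX`, `vendor_share`, and `max_deviation` are hypothetical helpers, not part of the study's tooling.

```python
# Rows transcribed from Table IX: (country, vendors_total, reported_share_percent).
TABLE_IX = [
    ("USA", 36, 17), ("Spain", 24, 11), ("Indonesia", 26, 12), ("Italy", 15, 7),
    ("UK", 19, 9), ("Mexico", 17, 8), ("Thailand", 28, 13), ("Germany", 21, 10),
    ("Belgium", 17, 8), ("Netherlands", 16, 8),
]

# Assumption: the "214" in the table's total row is the overall vendor count.
TOTAL_VENDORS = 214


def vendor_share(vendors_total: int, all_vendors: int = TOTAL_VENDORS) -> float:
    """Percentage of all vendors that were observed in one country."""
    return 100.0 * vendors_total / all_vendors


def max_deviation(rows=TABLE_IX) -> float:
    """Largest gap, in percentage points, between computed and reported shares."""
    return max(abs(vendor_share(total) - reported) for _, total, reported in rows)


if __name__ == "__main__":
    for country, total, reported in TABLE_IX:
        print(f"{country:12s} computed {vendor_share(total):5.2f}%  reported {reported}%")
```

Under this reading, every reported share lies within one percentage point of the computed value, which supports interpreting the column as a fraction of the 214 vendors.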

17 17 B. Custom permissions Table X reports a subset of custom permissions defined by device vendors, MNOs, third-party services, and chipset manufacturers. MANUFACTURER PERMISSIONS Vendor(s) Developer Signature Permission Package name com.sonyericsson.permission.FACEBOOK com.sonyericsson.facebook.proxylogin Sony Ericsson (SE) Sony Sony Ericsson (SE) Sony com.sonymobile.permission.TWITTER com.sonymobile.twitter.account com.sonymobile.googleanalyticsproxy.permission.GOOGLE android Sony Ericsson (SE) Sony ANALYTICS USE HTC *.permission.SYSTEM Android (TW) Sony com.sonymobile.permission.READ GMAIL com.sonymobile.gmailreaderservice Sony Ericsson (SE) Samsung Corporation (KR) *.ap.accuweather.ACCUWEATHER DAEMON ACCESS PROVIDER Samsung MDM Lenovo (CN) android.permission.LENOVO android Lenovo com.asus.loguploaderproxy AsusTek (TW) Asus asus.permission.MOVELOGS Xiaomi (CN) com.miui.core miui.permission.DUMP CACHED LOG Xiaomi GENERIC VPN com.sec.enterprise.knox.KNOX android Samsung Samsung (KR) Samsung (KR) Samsung ENTERPRISE VPN SOLUTION com.sec.enterprise.permissions android.permission.sec.MDM Meizu com.meizu.permission.CONTROL VPN Meizu (CN) MNO PERMISSIONS Developer Signature MNO Permission Package name ZTE T-Mobile US com.tmobile.comm.RECEIVE METRICS LG com.tmobile.comm.RECEIVE METRICS com.lge.ipservice T-Mobile US ADM MESSAGE hr.infinum.mojvip Infinum (HR) [41] H1 Croatia hr.infinum.mojvip.permission.RECEIVE com.locationlabs.cni.att AT&T (US) AT&T (US) [48] com.locationlabs.cni.att.permission.BROADCAST Asurion (US) [18] Verizon (US) MESSAGE South Korea Telekom com.skt.aom.permission.AOM RECEIVE Naver (JP) THIRD-PARTY SERVICE PERMISSIONS Package name Permission Developer Signature Provider Facebook Facebook *.ACCESS com.facebook.system Amazon Amazon SDK Huawei (CN) Baidu android.permission.BAIDU SERVICE LOCATION com.oppo.findmyphone Oppo (CN) android.permission.BAIDU LOCATION SERVICE Baidu Logia com.digitalturbine.ignite.ACCESS LOG com.dti.sliide Digital Turbine Logia 
Digital Turbine com.dti.att EVENTS com.dti.att.permission.APP com.ironsource.appcloud.oobe.wiko ironSource ironSource com.ironsource.aura.permission.C2D MESSAGE com.vcast.mediamanager Verizon (US) Synchronoss PERMISSION Vodafone (GR) Exus MESSAGE com.trendmicro.freetmms.gmobi GMobi com.trendmicro.androidmup.ACCESS TMMSMU REMOTE SERVICE TrendMicro (TW) Skype CONTACTS Skype (GB) com.cleanmaster.sdk Samsung (KR) CleanMaster com.cleanmaster.permission.sdk.clean Netflix (US) Netflix *.permission.CHANNEL ID CHIPSET PERMISSIONS Developer Signature Provider Permission Package name com.qualcomm.location ZTE (CN) Qualcomm com.qualcomm.permission.IZAT com.mediatek.mtklogger TCL (CN) MediaTek com.permission.MTKLOGGER Samsung (KR) Broadcom broadcom.permission.BLUETOOTH MAP Table X: Custom permission examples. The wildcard * represents the package name whenever the permission prefix and the package name overlap.
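The wildcard convention in the caption of Table X can be made concrete with a short sketch. It is illustrative only: `abbreviate_permission` and `expand_permission` are hypothetical helper names, and the code assumes that "overlap" means the permission name begins with the full package name followed by a dot. The example pairing of `com.facebook.system` with `*.ACCESS` is read from the Facebook row of Table X and is likewise an assumption about how that row lines up.

```python
def abbreviate_permission(package: str, permission: str) -> str:
    """Replace a leading package-name prefix in a permission name with '*'."""
    prefix = package + "."
    if permission.startswith(prefix):
        return "*." + permission[len(prefix):]
    # No overlap: the permission is shown in full.
    return permission


def expand_permission(package: str, abbreviated: str) -> str:
    """Inverse of abbreviate_permission: substitute the package name for '*'."""
    if abbreviated.startswith("*."):
        return package + abbreviated[1:]
    return abbreviated


if __name__ == "__main__":
    # Assumed reading of the Facebook row in Table X.
    print(expand_permission("com.facebook.system", "*.ACCESS"))
    # A row whose permission does not start with the package name stays unabbreviated.
    print(abbreviate_permission("com.cleanmaster.sdk",
                                "com.cleanmaster.permission.sdk.clean"))
```

The two helpers are inverses whenever the permission starts with the package name, so a full permission name can be recovered from an abbreviated row only when that row's package name is known.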
