Mobile devices that run Android operating system are widely used. The applications running in Android mobiles can have malicious permissions due to malware. In other words, Android applications might spread malware which can sabotage valuable data. Therefore it is essential to have mechanism to classify malware and benign mobile applications running in Android phones. Since Android mobile applications run in the confines of mobile devices and associated servers, it is very challenging task to detect Android malware. Many solutions came into existence to detect malware applications. Of late Abawajy et al. proposed a technique known as Iterative Classifier Fusion System (ICFS) which employs classifiers iteratively with fusion to generate a final classifier for effective detection of malware. They combined NBtree classifier, Multilayer perception and LibSVM with polynomial kernel to achieve this. However, the system does not focus on reduction or pruning of Android application permissions so as to build a classifier that reduces time and space complexity. In the proposed system, a methodology is proposed that focuses on reduction or pruning of Android application permissions and ranking them in order to build a classifier that reduces time and space complexity. The classifier modelled with best ranked permissions can be representative of all permissions as least significant permissions are pruned to reduce search space. We built a prototype application to demonstrate proof of the concept. The experimental results revealed that the proposed system performs better in improving detection accuracy besides precision and recall measures.

Index Terms– Malware, malware detection technique, pruning, ranking
Android malware has become a potential risk to mobile applications. Due to the increase in the usage of Android mobile applications in the world, the attackers target it to spread malware. Malware is the malicious software that can cause damage to a system of mobile device in terms of removing data or denying a service and so on. Malware can be one of the forms of cyber security threats. It has history of damaging potential applications in the real world. The cyber security threat landscape is increased drastically with the presence of Android malware. The rationale behind this is that people of all walks of life started using Android smart phones for virtually any operation including banking and shopping.
As the mobile smart phones and associated sensors produce huge amount of data known as big data, it became crucial to protect Android applications from malware. Big data has become an important buzzword and there are enterprises that depend on the business intelligence acquired from big data for decision making. In this context, it is important to have an efficient mechanism to detect and prevent Android malware. Many approaches came into exultance as found in the literature. They include signature based approaches 1, commercial malware detection methods 2, detection methods that also support Dalvik byte code transformations 4, data flow path based solutions 5 and 6, characterization of malware through system calls 9 and a hybrid approach that combines both API-calls and permission based approaches. In 12 an iterative approach is followed with multiple classifiers to detect malware. From the literature it is found that the methods are focusing more on accuracy of the solution rather than the computational complexity. In this paper we proposed an approach that focuses more on reducing computational complexity and increasing accuracy of the detection method. Our contributions are as follows.
• We proposed a framework to have a systematic approach in Android malware detection. It is based on the significant permission based permission reduction, pruning and ranking approaches.
• We proposed an algorithm known as Permission Significance-based Pruning for Android Malware Detection (PSP-AMD) to build a classifier that provides accuracy of detection and reduces computational complexity.
• We built a prototype application that demonstrates proof of the concept. The empirical results revealed the utility of the proposed solution which is light weight and focuses not only on the accuracy but also reduction of computational complexity.
The remainder of the paper is structured as follows. Section 2 provides review of literature. Section 3 presents the proposed system in detail. Section 4 presents experimental results while section 5 concludes the paper and provides future scope.
This section provides review of literature on the malware detection methods and related works. As the Android mobile platform became popular, adversaries are targeting spreading of malware through Android mobile apps. There is a good survey on the current methods to detect malware in Android applications is found in 1. There are signature based methods that are used to make use of malware signatures for detection. Signature based approaches are more prevalent among solutions available. Zhou et al. 2 studied commercial malware detection systems that are popular. Their studies revealed the fact that the detection rate of the method is between 20.2% and 78.6%. Similar kind of work is made in 3 where experimental results are provided for man popular anti-malware approaches associated with cloud. For many modern computers, the previous solutions were found inadequate. The work is to know whether the current anti-malware detection methods can handle Dalvik byte code transformations. Their experiments proved that there was further research needed to define methods to handle obfuscation. In 4 an advanced detection method that is behavior-based is presented. It could prevent the vulnerability known as system-call injection. Asymptotic equi-partition property is used by their method in order to extract important call sequences to detect malware.
A framework for automated analysis for detection of malware in Android applications is proposed. The framework identified malicious behaviours automatically by simulating intent broadcasts and user-interface events. Both static and dynamic analyses are combined in 5 while 6 make use of data flow path to distinguish benign apps from malicious ones. They made experiments on a large dataset and found that there was classification accuracy of 96% with benign apps and 98% with malware apps. In 7 a new approach is proposed to make use of system call in order to detect malware by characterizing malware behavior. In 8 a static approach is proposed based on the API-call based and permission-based approaches. It uses a multi-classifier system and follows a collaborative approach based on probability theory that combines decisions of multiple classifiers. There are many approaches that exist in the literature. There are ensemble-classifiers that utilize multiple approaches. In 7 pruning ensemble classifiers are studied.
A multi-level system is proposed for detection of Android malware while focuses on an iterative multi-tier ensemble classifiers to do the same. In 10 another multi-classifier system is built with high accuracy. These solutions have used multi-classifier systems to increase accuracy in the detection of malware when compared with the solutions that used single-classifiers. The problem with these systems is that they are very huge and cannot be directly used for smart phone applications. They need more processing power and storage capacities besides causing much communication overhead as explored in 9. These approaches focused on detection accuracy but they did not consider computational cost. There are some features of malware that are generally used to characterise them. In 11 multiple features are used to detect malware. They used feature selection algorithms to do so. There are some limitations in the feature selection methods too as they give emphasis on the algorithm that is specific. In 12 an iterative classifier fusion method is employed where multiple classifiers are involved. It is found to be complex and it can be optimized further. In this paper we proposed a light weight approach known as permission reduction, pruning and ranking based classifier for efficient detection of Android malware.