18 November 2009

Utilize Google Search To Search Keyword From Content of a Website

Sometimes you may have favorite blogs having good posted content and you want to search something that may be useful or needed for you. Almost all of the blog powered by some blog engines such as wordpress, drupal and any other blog engine have a search box feature to enable visitor to search the content inside. However you can't rely on this search since most of the result appeared is not what you expected execept that blog affiliates google custom search like mashable.com have. Why the content results is not what you expected ? this caused by the weakness of relevance, synonyms and other common search engine features and fyi this search depends heavily on simple database query utilizing "LIKE %%' syntax.

Google Search can help you to perform a search deeply to specific website you want with the power of google in their features like page rank, relevance, synonim, more targeted content etc.

In this example, i use uxbooth.com site since this site is so many talks about Usability then we test their Search Usability :-p.


I would like to search with "long tail" and "usability" keyword. See what happened in their search result and try to click top 5 links from the result and check whether the search term you type inside the content.












Let's compare those with the google result where previously i typed site:http://www.uxbooth.com usability and site:http://www.uxbooth.com long tail into google search box.














You see rite ? Google has better result and targeted whether you try to click the link result their should be information about the search term you typed in.

Conclusion

Thank to google, they have better result that mean less time needed to search useful information inside specific website. Don't forget to put "site:url" and followed with the keyword you wanna search when you want to search deeply to specific website using Google and you can combine that syntax with other useful google search syntax as explained detail in Google help

14 October 2009

How to handle concurrency in PHP

I know this is too academic to be explained further however this post is intended to be the practical way about how to handle concurrency issue in PHP+MySQL.
Sometimes you will face the problem when you develop application where there is a shared resource and used concurrently by many application process runing. If we are not aware about concurrency then that shared resouce could be broken or miscalculated when more than one process of your application hit the shared resource.

For java, C/C++/C#, Visual Basic and even classic ASP developer, i believe they have experienced about concurrecy and ACID feature of their database like MS-SQL, Oracle, PostgreSQL when implementing into business application like finance, inventory, accounting, etc but I found difficulty when explaining concurrency for PHP developer that only have used PHP+MYSQL(MyISAM) and has religious faith to certain Content Management System or Content Management Framework like Drupal, Joomla, and any other that relying on MySQL(MyISAM).


An example i use here is simple e-commerce stock control of product item. Stock plays important role here to control users when they try to buy/reserve/claim a product item. Logically user allowed to buy/reserve a product item if the product item is still available in the stock. Unless they will see or notified by system that product item is out of stock or sold out.

Lets have alook to the script


<?php

define('MYSQL_USER', 'testuser');
define('MYSQL_PWD', 'password');
define ('MYSQL_HOST', '192.168.56.2');
define('MYSQL_DB', 'concurrency');


function db_connect() {
static $db;
if (!isset($db)) {
$db =new mysqli(MYSQL_HOST,MYSQL_USER, MYSQL_PWD, MYSQL_DB);
if (mysqli_connect_errno()) {
printf("Connect failed: %s\n", mysqli_connect_error());
exit();
}
}
return $db;

}


$stock=db_connect()->query('select ammount from stock where productid=1 and ammount > 0');
$row=$stock->fetch_row();

if ($row[0]>0){

db_connect()->query('UPDATE stock SET ammount=' . ($row[0] - 1) .' WHERE productid=1');
db_connect()->query('INSERT INTO order_product VALUES (null, 1,uuid(),1)');

print '<h2>SUCCESS</h2>';
}else {

print '<h2>STOCK is OUT</h2>';
}

?>




The problem of example(script) above will appear when many user concurrently try to buy the same product. Let's say we have 5 product item in our stock and the number of user want to buy is 10, 50, 100, or 500 concurrently. Do you still rely on this script:

<?php

stock=db_connect()->query('select ammount from stock where productid=1 and ammount > 0');
$row=$stock->fetch_row();

if ($row[0]>0){

db_connect()->query('UPDATE stock SET ammount=' . ($row[0] - 1) .' WHERE productid=1');
db_connect()->query('INSERT INTO order_product VALUES (null, 1,uuid(),1)'); //uuid function used to fake userid

print '<h2>SUCCESS</h2>';
}else {

print '<h2>STOCK is OUT</h2>';
}

?>

I always find some developer believe that the script above is OK and they never think about concurrency and they believe the process will be sequential automatically :-p. Lets do concurrent request to above script using apache benchmark (ab) :


Initial data of stock:


mysql> select stockid, productid from stock;
+---------+-----------+---------+
stockid productid ammount
+---------+-----------+---------+
1 1 5
+---------+-----------+---------+
1 row in set (0.00 sec)



Initial data of Order of product:


mysql> show create table order_product \G
*************************** 1. row ***************************
Table: order_product
Create Table: CREATE TABLE `order_product` (
`orderid` int(4) NOT NULL AUTO_INCREMENT,
`productid` int(4) NOT NULL,
`uid` char(36) NOT NULL,
`orderstatus` int(4) NOT NULL DEFAULT '0',
PRIMARY KEY (`orderid`),
KEY `productid` (`productid`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)



mysql> select * from order_product;
Empty set (0.00 sec)





lets flood using ab and set 50 concurrent requests:


rizky@Elmo:~$ ab -c 50 -n 100 http://www.rb.com/test/db.php
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking www.rb.com (be patient).....done


Server Software: Apache/2.2.11
Server Hostname: www.rb.com
Server Port: 80

Document Path: /test/db.php
Document Length: 19 bytes

Concurrency Level: 50
Time taken for tests: 0.887 seconds
Complete requests: 100
Failed requests: 95
(Connect: 0, Receive: 0, Length: 95, Exceptions: 0)
Write errors: 0
Total transferred: 31175 bytes
HTML transferred: 2375 bytes
Requests per second: 112.77 [#/sec] (mean)
Time per request: 443.362 [ms] (mean)
Time per request: 8.867 [ms] (mean, across all concurrent requests)
Transfer rate: 34.33 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 4 5.1 2 27
Processing: 10 22 11.9 20 96
Waiting: 10 22 11.1 19 88
Total: 14 26 15.3 22 115

Percentage of the requests served within a certain time (ms)
50% 22
66% 25
75% 29
80% 30
90% 34
95% 41
98% 99
99% 115
100% 115 (longest request)



Check to our stock now:


mysql> select * from stock;
+---------+-----------+---------+
stockid productid ammount
+---------+-----------+---------+
1 1 0
+---------+-----------+---------+
1 row in set (0.00 sec)


Check how many order of product we have now :


mysql> select * from order_product;
+---------+-----------+--------------------------------------+-------------+
orderid productid uid orderstatus
+---------+-----------+--------------------------------------+-------------+
1 1 71df191e-b889-11de-b26d-08002705b03f 1
2 1 71edb0e6-b889-11de-b26d-08002705b03f 1
3 1 71edbb9a-b889-11de-b26d-08002705b03f 1
4 1 71edd79c-b889-11de-b26d-08002705b03f 1
5 1 71eddf80-b889-11de-b26d-08002705b03f 1
6 1 71ee06c2-b889-11de-b26d-08002705b03f 1
7 1 71eec0da-b889-11de-b26d-08002705b03f 1
8 1 71eee29a-b889-11de-b26d-08002705b03f 1
9 1 71f1328e-b889-11de-b26d-08002705b03f 1
+---------+-----------+--------------------------------------+-------------+
9 rows in set (0.00 sec)



How come there are 9 orders ? should be 5 right ? cos you only have 5 product item in stock. For developers who believe the order must be 5 using this methods i think need to change you view and need more books to eat. Anyway i think this's fair cos i have seen a lot of PHP book never explain about concurrency and off cource the PHP itself it's not intended to be multithreaded languange.

Do you know PHP IS ALSO UNIX? YES. So We can use POSIX UNIX feature like semaphore to handle this concurrency problem. How ? Lets change the script using semaphore. You can read about semaphore, mutex, or race condition in unix programming book or freely at wikipedia.


<?php
define ('MYSQL_USER', 'testuser');
define('MYSQL_PWD', 'password');
define ('MYSQL_HOST', 'localhost');
define('MYSQL_DB', 'concurrency');


function db_connect() {
static $db;
if (!isset($db)) {
$db =new mysqli(MYSQL_HOST,MYSQL_USER, MYSQL_PWD, MYSQL_DB);
if (mysqli_connect_errno()) {
printf("Connect failed: %s\n", mysqli_connect_error());
exit();
}
}
return $db;

}

function openlock() {
$SEMKEY='123456'; //do you think better to change ftok()
$seg = sem_get( $SEMKey, 1, 0666, -1) ;
return $seg;

}
function releaselock($sem){
sem_release($sem);
}

function getlock($seg) {
return sem_acquire($seg);

}


$sem=openlock();
if (getlock($sem)) {
$stock=db_connect()->query('select ammount from stock where productid=1 and ammount > 0');
$row=$stock->fetch_row();

if ($row[0]>0){

db_connect()->query('UPDATE stock SET ammount=' . ($row[0] - 1) .' WHERE productid=1');
db_connect()->query('INSERT INTO order_product VALUES (null, 1,uuid(),1)');

print '<h2>SUCCESS</h2>';
}else {

print '<h2>STOCK is OUT</h2>';
}
releaselock($sem);
}else {
print '

TRY AGAIN

';

}

?>


Semaphore is one of reliable way to protect variable or shared resource to be used simmultanously by many process although it does not prevent resource deadlock and it does not provide rollback or ACID feature.

Other way you can do are by using mysql transactional storage engine like innodb and mysql get_lock() function.

Using transaction in innodb:
I think this is more reliable than other since transaction has ACID features that guaranty your data to be more consistent and durable.

<?php

define('MYSQL_USER', 'testuser');
define('MYSQL_PWD', 'password');
define ('MYSQL_HOST', '192.168.56.2');
define('MYSQL_DB', 'concurrency');


function db_connect() {
static $db;
if (!isset($db)) {
$db =new mysqli(MYSQL_HOST,MYSQL_USER, MYSQL_PWD, MYSQL_DB);
if (mysqli_connect_errno()) {
printf("Connect failed: %s\n", mysqli_connect_error());
exit();
}
}
return $db;

}


db_connect()->query('SET autocommit=0');
db_connect()->query('START TRANSACTION');



$stock=db_connect()->query('select ammount from stock where productid=1 and ammount > 0 FOR UPDATE');
$row=$stock->fetch_row();

if ($row[0]>0){
db_connect()->query('UPDATE stock SET ammount=' . ($row[0] - 1) .' WHERE productid=1');
db_connect()->query('INSERT INTO order_product VALUES (null, 1,uuid(),1)');
db_connect()->query('COMMIT');
print '<h2>SUCCESS</h2>';
}else {

print '<h2>STOCK is OUT</h2>';
}


?>



Using mysql get_lock function:
I like this function we can define the timeout to acquire the lock in one of the argument.


query('select get_lock("mylock", 10)');
$lock=$mylock->fetch_row();
return $lock[0];
}
function releaselock(){
db_connect()->query('select release_lock("mylock")');

}

function isfreelock() {
$mylock=db_connect()->query('select is_free_lock("mylock")');
$lock=$mylock->fetch_row();
return $lock[0];
}



if (getlock()) {
$stock=db_connect()->query('select ammount from stock where productid=1 and ammount > 0');
$row=$stock->fetch_row();

if ($row[0]>0){

db_connect()->query('UPDATE stock SET ammount=' . ($row[0] - 1) .' WHERE productid=1');
db_connect()->query('INSERT INTO order_product VALUES (null, 1,uuid(),1)');

print '<h2>SUCCESS</h2>';
}else {

print '<h2>STOCK is OUT</h2>';
}
releaselock();
}else {
print '<h2>TRY AGAIN</h2>';

}

?>



You can test the script above using ab and you can set number of concurrent request you like.

09 October 2009

Finally i have Drizzle installed

After figthing with compilation on my Jaunty Ubuntu then i can have my own drizzle.

Start the server:

rizky@Elmo:/usr/local$ sudo drizzled --no-defaults --user=drizzle --basedir=/usr/local/ --datadir=/usr/local/drizzle/data/ >> /usr/local/drizzle/drizzle.err &
[1] 9792
rizky@Elmo:/usr/local$ InnoDB: The InnoDB memory heap is disabled
InnoDB: Mutexes and rw_locks use GCC atomic builtins.
091009 13:55:52 InnoDB: highest supported file format is Barracuda.
091009 13:55:52 InnoDB Plugin 1.0.3 started; log sequence number 48107
Listening on 0.0.0.0:4427
Listening on :::4427
drizzled: Forcing close of thread 0 user: '(null)'
drizzled: ready for connections.
Version: '2009.09.1144' Source distribution (trunk)




Connect as client

rizky@Elmo:~$ drizzle
Welcome to the Drizzle client.. Commands end with ; or \g.
Your Drizzle connection id is 2
Server version: 2009.09.1144 Source distribution (trunk)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

drizzle> select version();
+--------------+
| version() |
+--------------+
| 2009.09.1144 |
+--------------+
1 row in set (0 sec)

drizzle>


There are alot different in feature than mysql http://drizzle.org/wiki/MySQL_Differences

29 September 2009

How many tables in a certain database ?

My friend asked me regarding how to count the number of table in a certain MySQL database. As we might know that we can do this using many ways however that i believe there iss a fastest way to do this rather than making script or parsing SHOW TABLES statement.

MySQL provides Information_Schema containing meta information about your databases, tables, fields, index, privileges, views, and many more meaning we can query it to capture the number of table in a database using the following query :


SELECT count(*) FROM `TABLES` WHERE TABLE_SCHEMA='name-of-database'

11 September 2009

When My Jasper Server Got Trouble

As Business Intelligence tool, it's so frequently used by my company's client to keep track daily, monthly activities in their portal such as membership growth, number of each classifieds ads, trend of specific ad category and many more. These functionalities easily generated in Jasper Server.

Recently i face some error reported with the following error appears when user try to login:


java.lang.NoClassDefFoundError: Could not initialize class org.springframework.webflow.util.RandomGuid
...

root error: org.springframework.web.util.NestedServletException: Handler processing failed; nested exception is java.lang.NoClassDefFoundError: Could not initialize class org.springframework.webflow.util.RandomGuid

From the error text above i previously tot that it caused by no class loaded in lib (library) directory or undefined $CLASSPATH where in Java world there are two kind of classes related exception that i often face : java.lang.ClassNotFoundException and java.lang.NoClassDefFoundError. But when i went trough to RandomGuid of spring webflow source code :
http://www.docjar.com/html/api/org/springframework/webflow/util/RandomGuid.java.html
especially line #130

125   static {
126 secureRandom =
new SecureRandom();
127
long secureInitializer = secureRandom.nextLong();
128 random =
new Random(secureInitializer);
129
try {
130 id = InetAddress.getLocalHost().toString();
131 }
132
catch (UnknownHostException e) {
133 e.printStackTrace();
134 }
135
136 }

There's attempt to get Localhost by invoking InetAddress.getLocalHost().toString() then i believe the problem must be caused by undefined hostname in /etc/hosts so i put the following definition

127.0.0.1 localhost.localdomain
localhost

on /etc/hosts of jasper server and i can login now without error prompted.