
Tuesday, February 10, 2009

Robot of the Day: CubeStormer, the Rubik's Cube Solver

Ever played with a Rubik's Cube before? If you have, then you know how hard it is and how long it takes to solve one. The robot we talk about today, however, can solve it within seconds, and most impressively, the robot was built completely using Lego pieces from the Lego Mindstorms Kit, which means you could build a robot just like this yourself for only a couple hundred bucks!

The robot's name is CubeStormer, built by British engineer Mike Dobson using Lego Mindstorms parts hooked up to a laptop computer. The computer acts as the brain and performs tasks such as recognizing colors, solving the puzzle using algorithms, and sending motor commands.

As shown in the video below, the robot first quickly inspects all six sides of the cube using multiple cameras, rotating it a few times to recognize the current state of the cube. The computer vision task is actually really simple because the cube is placed at a fixed position, so the recognition software only needs to sample a few points for each color piece and then simply detect the color of the pixels (one out of six possible choices). The state of the cube is then passed on to solver software (such as this free online one), which generates sequences of moves that are translated into motor commands for the robot to perform.
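To make the color-detection step a bit more concrete, here is a minimal Python sketch of how one sticker could be classified from a few sampled pixels. This is my own guess at the idea, not CubeStormer's actual software: the reference RGB values and the function names are assumptions made up for illustration.

```python
# Hypothetical reference RGB values for the six sticker colors (assumed, not measured).
REFERENCE_COLORS = {
    "white": (255, 255, 255),
    "yellow": (255, 213, 0),
    "red": (183, 18, 52),
    "orange": (255, 88, 0),
    "blue": (0, 70, 173),
    "green": (0, 155, 72),
}


def average_rgb(pixels):
    """Average a handful of (r, g, b) samples taken from one sticker."""
    count = len(pixels)
    return tuple(sum(pixel[i] for pixel in pixels) / count for i in range(3))


def classify_sticker(pixels):
    """Return the name of the reference color closest to the sampled pixels."""
    r, g, b = average_rgb(pixels)

    def squared_distance(name):
        ref_r, ref_g, ref_b = REFERENCE_COLORS[name]
        return (r - ref_r) ** 2 + (g - ref_g) ** 2 + (b - ref_b) ** 2

    return min(REFERENCE_COLORS, key=squared_distance)
```

Because the cube sits at a fixed position, the pixel coordinates to sample can be hard-coded per sticker, and running `classify_sticker` on all 54 stickers gives the cube state for the solver. A real system would probably calibrate the reference colors under its own lighting.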

What is impressive about this robot, though, is the engineering side of things, such as how parts are connected and how motors are used, all with toy Lego pieces. A beautiful design enabled the robot's actuators to solve the puzzle in such a short period of time. If you look closely at the video, you'll also notice that two rows of the cube can be rotated at the same time to speed it up!

CubeStormer by Mike Dobson

The time it took the robot to solve a random game was about 12 seconds. This is very much comparable to the fastest human Rubik's Cube solvers such as the one shown below.

Rubik's cube official world record 7.08 Erik Akkersdijk

There are of course other Rubik's Cube solving robots in the wild, such as the one built by UC Berkeley shown in the video below, which solved a puzzle in 6 seconds. But apparently this robot would cost a lot more.

Rubik's cube solver by UC Berkeley

However, the Cubinator, aka RuBot II, by Pete Redmond from Dublin, Ireland gets extra points in my book for Human-Robot Interaction. Although much slower compared to the other two robots, it has a head and two arms. And after picking up the cube all by itself, it even played music and talked to the audience while solving the puzzle.

Cubinator by Pete Redmond

What if the Cubinator could not only solve a Rubik's Cube, but was also capable of playing board games or hide-and-seek with your kids, telling them jokes, reading books to them, and helping them with their homework? Would you want one for your kids? If so, for how much? If not, why?

Picture of the Day:

Leftover Valentine’s chocolate? Use it to measure the speed of light with your microwave. Click the picture to find out how!

Monday, February 09, 2009

Random Thoughts: Are you being watched?

It started as a great idea at the Lower Merion School District, outside Philadelphia, when school officials decided to loan laptop computers to students to encourage them to embrace newer technologies and to study better. However, the situation took a dramatic turn when a suit was filed against the school district for spying on students in their homes through the web cam built into the laptop and software that allowed school officials to activate the camera remotely in order to view and take pictures of the students.

The interesting part was that a student named Blake Robbins was accused of selling drugs and taking pills by school officials (Blake claimed that he was eating candies), and the school official actually provided proof -- images of Blake eating things at home taken secretly through the web cam -- to back up their claim! The image on the left shows Blake's family and their lawyer appearing on "Early Show Saturday Edition" discussing the entire fiasco (photo credit CBS).

I am simply AMAZED at the intellectual capabilities of the school officials involved here!! The issue here is not whether the kid took pills or not. The issue is about a crude invasion of personal privacy at people's own homes without their knowledge and consent. A federal judge quickly ordered the school district to stop activating the cameras, and the school complied. The FBI has also opened a criminal investigation of the web cam use to see if the school district broke any federal wiretap or computer-intrusion laws.

Disturbing as the story is, what I want to emphasize here, though, is slightly different. Many people use web cams to do video conferences with friends and family. Some still use the old external web cams connected through USB ports (I am one of them). However, most people use the laptop built-in web cams these days because of the convenience and also because most laptops come with built-in web cams. So do you really know if you are being watched?

People these days take their laptops with them everywhere they go, including very private places such as their bedrooms and bathrooms. Many companies also give their employees laptops so they can work from home or while they are on trips. Many large organizations DO put remote access software on company laptops for ease of IT support, and I personally have used such software when I worked in IT support in the past. The truth is, IT support staff can remotely watch your monitor screen while you have no idea they are doing so. And if the computer has a web cam, activating the web cam through such management tools is a piece of cake.

Also, many home computers are hacked and made into "zombie" computers for spamming or Denial of Service attacks in large botnets controlled by hackers. These hackers can also easily control the web cams connected to these infected computers, and the users would have no clue about such activities! If you think you are safe behind security firewalls you purchased from McAfee or Symantec, you'd better think twice.

So what lesson should we learn from this? We should never naively believe that we have total control of the web cams connected to our computers. The safest thing to do is to cover the web cam with tape or a piece of paper when we are not actively using it, because you never know who might be watching through it. Especially for people who have built-in web cams, it is so easy to forget about that special "eye" in the room, and it might be watching you at this very moment!

The commercial in the YouTube video below might seem funny, but it wouldn't be so funny if you weren't in a video conference and someone was secretly watching you through the web cam. I think all built-in web cams should have a cover, so people only open it when they actively use it and can always close the cover when they are not.

This also poses an interesting question about future robots in people's homes. These robots will probably also have eyes, and eyelids will probably have to be mandatory so they don't peep on you when you don't want them to.

Video of the Day:

Sony laptop with built-in web cam ad!



Sunday, February 08, 2009

Paper Review: Evaluation of evaluation in information retrieval

This paper was written by Saracevic from Rutgers University and published at the 18th annual international ACM SIGIR conference on Research and development in information retrieval, 1995.

Evaluation metrics are important components of scientific research. Through evaluation metrics, the performance of systems, algorithms, and solutions to problems can be measured and compared against baselines or against each other. However, what metrics should we use, at what level, and how good are these metrics? Questions like these must be carefully considered. This paper discussed such concerns about past and existing evaluation metrics used in Information Retrieval (IR) and raised many more questions. Please note that this paper was published in 1995 and evaluation metrics/methods in IR have progressed dramatically since then.

This paper is something of a survey paper that discussed evaluation metrics used in IR throughout its history and provided many literature references. The main contribution of the paper is that it suggests looking at the evaluation of IR from a much higher perspective, going back to the true purpose of IR, which is to resolve the problem of information explosion. When considering the evaluation of IR systems from this vantage point, the paper pointed out that there is a lot more to be evaluated besides the common/popular evaluation metrics at the processing level alone (e.g. Precision and Recall). It urged the IR community to break out of the isolation of narrow single-level evaluations.

The author systematically defined six levels of objectives (engineering, input, processing, output, use and user, and social) that need to be addressed in IR systems together with five evaluation requirements (a system with a process, criteria, measures, measuring instruments, and methodology). Then he further discussed in detail the current practice, potential problems, and needs of evaluation metrics with respect to each of the requirements and how they can be categorized into the six objective levels. This is an excellent way of organizing content and arguments, which allows readers to easily see the big picture in a structured framework.

The paper made a strong statement that “both system- and user-centered evaluations are needed” and more efforts are required to allow cooperative efforts of the two orientations, in contrast to the widely proposed shifting from one to the other. This again highlights the author’s suggestion of treating the evaluation of IR as an overall approach.

The author identified many compelling problems and important issues with regard to the evaluation of IR and argued them well. To name a few: Laboratory collections are too removed from reality and TREC has highly unusual composition as to types and subjects of documents and should not be the sole vehicle for IR evaluation. Applying various criteria in some unified manner still poses a major challenge in IR. Assumption of one and only one relevant set as an answer to a request is not warranted. When using relevance as the criterion with precision and recall as the corresponding measures, someone has to judge or establish relevance; the assumption here is that the judgment is reasonable while we know relevance is a very subjective topic.

The paper repeatedly emphasized evaluation of interaction between users and IR systems as an integral part of the overall evaluation. In recent years, there has also been a strong trend of more and more researchers in various areas becoming interested in understanding how human factors and the interaction between humans and machines (robots) affect the performance of systems. A good example is the emergence of Human Robot Interaction (HRI). Therefore, this topic deserves a separate discussion here. The ultimate goal of an IR system is to serve humans. If the information retrieved is not presented to the user correctly, then the IR system fails miserably. Also, because of the subjectivity (with respect to an individual user) and ambiguity (such as query term meanings) of IR, multiple rounds of interaction between the user and the IR system can dramatically improve the performance of information retrieval. One example would be retrieving documents related to the query term “Python”. An interactive IR system can further allow the user to specify if he/she wants to retrieve information about the animal or the programming language. As stated in the paper, interactions in IR were extensively studied and modeled; however, interactivity plays no role in large evaluation projects (which I believe is still true even today). Granted that it is difficult to come up with sound evaluation metrics for interactivity, more discussion and research in this area is definitely necessary.

This paper certainly has its shortcomings. First of all, the author could certainly have been more concise in the writing. Additionally, I found that the comparisons using expert systems and OPAC distract from the main ideas and do not contribute much to the arguments. Eliminating them would have made the paper more focused.

Granted that precision and recall are used as the main evaluation metrics to measure relevance at the system and process levels, many other evaluation metrics also existed but were not covered in this paper. Examples include (but are not limited to) F-measure, Entropy, Variation of Information, Adjusted Rand Index, V-Measure, Q-2, Log likelihood of the data, etc. Besides quantitative evaluation metrics, qualitative analysis is also a common tool people use to evaluate the performance of IR systems, and the paper didn’t touch this subject at all.
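For readers who have not worked with these metrics, here is a small Python sketch of precision, recall, and the balanced F-measure (F1) for a single query. It uses the standard textbook definitions rather than anything specific to the paper, and the function name and document IDs are made up for illustration.

```python
def precision_recall_f1(retrieved, relevant):
    """Compute precision, recall, and F1 for one query.

    retrieved: document IDs returned by the system.
    relevant:  document IDs judged relevant (someone still has to judge this).
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Hypothetical example: 4 documents retrieved, 3 judged relevant, 2 in common.
precision, recall, f1 = precision_recall_f1(
    ["d1", "d2", "d3", "d4"], ["d1", "d3", "d5"]
)
```

Note that everything here depends on the `relevant` set, which is exactly the subjective relevance judgment the paper worries about.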

The paper argued that it is a problem that “every algorithm and every approach anywhere, IR included, is based on certain assumptions” and these assumptions could have potential negative effects. I found this argument weak and not well constructed. It is true that when researchers design algorithms or model problems they make assumptions. Good researchers clearly identify these assumptions in their publications and analyze the potential effects these assumptions have on their algorithms or their models. Sometimes assumptions are made without sound reasons but are justified by good performance on real applications/data. It is almost unavoidable to make various assumptions when we do research. We should not be afraid of making assumptions, but we should be careful about our assumptions and justify them.

Lastly, there is one more important drawback of this paper. It did a good job identifying many problems and issues regarding evaluation in IR. However, it did a poor job providing constructive ideas and suggestions for many of these problems. I am not suggesting the author should find solutions to many of these problems, but some initial ideas or thoughts (even throw-away or naïve ones) would have improved the paper considerably.

In summary, the paper succeeded in drawing attention to treating evaluation in IR from a much higher perspective and also provided good descriptions, references, and discussion of the “current” state (up to 1995) of evaluation metrics in IR. I enjoyed reading this paper.

Video of the Day:

If you have a Toyota, take it in for a check, because it might be a matter of life and death for you and your family!



Saturday, February 07, 2009

Full Moon Crescent Saber: Prologue

Tonight was a night of a beautiful full moon, so I thought it would be the perfect time for me to start this translation project, something I've wanted to do for a long time. :)

Full Moon Crescent Saber is a book started by Gu Long and finished by Sima Ziyan. Because of that, it is also the most controversial book of Gu Long. I like this book because of the unique artistic conception and atmosphere described in the book, which really made it stand out from all other Gu Long books.

============================================================

Full Moon Crescent Saber
-- Written by Gu Long, Translated by Lanny Lin



Prologue
Full Moon
The moon may wax or wane. The story we are telling here is about the full moon, because it happened on a night of the full moon. On that night, the moon was more beautiful than on any other night, with a magnificence so mysterious, so bleak, and so heart-breaking.
Same goes with the story we are telling, a story filled with charms so mysterious yet beautiful and fantasies so mystifying yet stunning. As told in the ancient mythical tale, when the moon rises in the dark nights, there are always fairies dancing in the moonlight – fairies of the flowers, fairies of the gems and sapphires. Even dark souls and fairy foxes living deep underground would come out to worship the full moon and to draw in the vigor of the bright moonlight.
Sometimes they will even transform into human forms, live in the human world as many different characters, and do things no one would ever imagine.
These things are sometimes startling, sometimes heartwarming, sometimes frightening, sometimes exhilarating, and some other times beyond imagination. They could rescue someone from the deepest abyss; they could also shove someone off the steepest cliff.
They could give you all the fame and fortune in the entire world; they could also make you lose everything you’ve ever owned.
No one has ever seen their true faces, but no one could deny their existence either.
Crescent Saber
A saber may be straight or crescent. What we want to talk about here is a crescent saber, as curvy as Qing-Qing’s eyebrows.
The crescent saber belonged to Qing-Qing. Qing-Qing is a beautiful and mystic girl, just like the full moon of that night.
Sabers are weapons made to kill.
Same goes with Qing-Qing’s crescent saber. When the crescent reflection flashes by, calamity befalls; no one can escape the calamity, because no one can get away from the crescent shine of the saber.
The shine of the saber is not hasty; it is as smooth as the moonlight, but as soon as you see the moonlight, it has already befallen upon you.
There is only one moon in the sky; there is also only one crescent saber on earth.
It doesn’t always bring calamity when it appears in the mundane world. Sometimes it also brings people righteousness and happiness.
So when it appears in the world once again, what will it bring to this world this time?
No one knows.
Qing-Qing’s crescent saber is also emerald green[1], as green as the verdant distant mountain, as green as the spring trees, and as green as tears in young lovers’ eyes.
On the emerald green and crescent-shaped blade is a line of tiny words, “All night in the attic I hear the spring sprinkle[2].”
Fortunes may be as unpredictable as the winds and clouds in the sky. The moon may be dim or bright, wax or wane. Perfection in life was never easy to come by.
May we all be blessed with longevity. Though miles apart, our hearts still cross through the beautiful moon high in the sky.[3]


[1] In Chinese, “Qing” means emerald green.
[2] A verse from a poem of Lu You (1125-1210 AD), a poet from the Song Dynasty. I will not attempt to translate the entire poem here.
[3] The last few verses are excerpts of a very famous Mid-Autumn Festival poem by Su Shi (1037-1101 AD), a poet from the Song Dynasty. Here’s my poor attempt at translating the poem:
Prelude to the Melody of Water
When did the bright moon first ever appear?
Raising my wine cup I ask the blue sky.
High above in the moon palaces,
Wonder what year it is tonight.
I want to ride the wind and fly to the moon,
But I fear the jade terrace is too cold and high.
I’d rather stay in the human world,
And dance with my shadow in the moonlight.
As the moon rounds the red pavilion and slants through the silk-pad windows,
It shines upon every wakeful eye.
Moon, are you bearing any grudges?
Why always the full moon when loved ones are not nearby?
People may have joy and sorrow, parting and reunion,
The moon may be dim or bright, wax or wane.
Perfection in life was never easy to come by.
May we all be blessed with longevity, though miles apart,
Our hearts still cross through the beautiful moon high in the sky.

Now support the translator Lanny by following my blog and leaving comments! :)


Videos of the Day:


The beautiful poem referenced above was turned into the lyrics of a beautiful song sung by Teresa Teng, a huge pop icon in the 70s and 80s of the last century. Enjoy!



While searching for Teresa Teng's video, I ran into the video below on YouTube and was very impressed by the talent shown. A girl used her own music composition for the same poem and showed her beautiful voice. Even if you don't understand a word of hers, you'd still enjoy it (someone left a comment saying exactly that)!

Friday, February 06, 2009

AI and Robots: BYU using computer vision to catch parking violators

In the past, faculty members, staff, and students at BYU (Brigham Young University) had to obtain special stickers and place them on the windshields of their cars every semester if they wanted to park on campus in designated parking lots. Starting in Fall 2009, thanks to the new computer vision technology adopted by BYU police, this step is no longer necessary.

There are four types of parking lots at BYU: Faculty and Staff Parking, Graduate Student Parking, Undergraduate Student Parking, and Visitor Parking. Because faculty parking lots are everywhere on campus, while graduate student and undergraduate student parking lots are relatively farther away from the center of campus (graduate parking lots are slightly closer), many students are tempted to park at faculty parking lots just briefly for a class period of about one hour. Many used to be able to get away with it because there were only a limited number of parking officers, but that is probably coming to an end because campus police now have a better weapon to fight parking violators.

An automatic license plate recognition system, developed in Israel (I suspect this company) and brought into the US through Canada (don't ask me why), has become a very powerful tool for BYU police to catch parking violators. Cameras installed on top of police cars (as shown in the picture on the left) can automatically take pictures of cars in the parking lots. License plates are recognized and matched against a database to quickly determine if the car is allowed to park in that lot. An alarm sounds when a violator is identified, and with just a push of a button, a parking ticket is automatically generated and printed. Parking officers can now quickly drive around campus multiple times each day and get their job done, all from the comfort of their seats.

The picture on the right shows a closer view of the type of camera in use. The same kind of camera is also used at gated areas to automatically raise the gate for eligible cars. The accompanying software can read 60 plates a second and can recognize a license plate on a car going 120 mph with the help of the high-speed camera and fast computer algorithms for recognizing numbers and letters. The system also has GPS built in, so images of cars are also geo-tagged with GPS locations in case people forget where they parked their cars.

Inside the police car, a very durable tablet PC is mounted on the panel so the parking officer can interact with the software using a stylus pen. A wireless keyboard can also be used to enter license numbers into the system.

Obvious benefits of the system include: more efficient patrol of many parking lots, the comfort of staying in the car in extreme weather (hot or cold), and automatic alerts for stolen vehicles. However, this technology also has its drawbacks. For example, in cases of heavy snow (which is not so rare in Utah), the license plate might be covered by snow and not visible. Also, since the parking officers can now do most of their job without getting out of the car, special parking spots like the 15-minute ones are getting less attention and could be abused more frequently. In the past, people who owned multiple vehicles had the option of hanging a badge in one of the cars, which meant only one car could be parked on campus at a time because there was only one badge. With the new system, since there's no sticker and no badge, all of their cars can be parked on campus at the same time. Lastly, privacy is also a concern because now the campus police can easily identify when and where cars are parked each day.

So how does the recognition work? There are two main challenges: 1. Identify the license plate in the picture. 2. Recognize the license number. I don't know the exact algorithms used in the system, but based on techniques learned from my computer vision class, I can certainly come up with some intelligent guesses. Identifying the license plate in a picture probably relies on edge detection techniques combined with detecting high-contrast areas that also have a rectangular or rhombus shape. A coarse-to-fine search is also likely. Recognizing the letters and numbers is relatively easy with machine learning classification algorithms such as decision trees or nearest neighbor.
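As a toy illustration of the second step, here is a minimal nearest-neighbor sketch in Python. To be clear, this is only my guess at the general idea, not the algorithm the actual system uses: the tiny 5x3 bitmap templates, the three-character alphabet, and the function names are all invented for this example, and it assumes the plate has already been located and cut into individual character images.

```python
# Hypothetical 5x3 binary templates for a few characters ("1" = dark pixel).
TEMPLATES = {
    "0": ["111", "101", "101", "101", "111"],
    "1": ["010", "110", "010", "010", "111"],
    "7": ["111", "001", "010", "010", "010"],
}


def hamming_distance(glyph_a, glyph_b):
    """Count the pixels that differ between two same-sized bitmaps."""
    return sum(
        pixel_a != pixel_b
        for row_a, row_b in zip(glyph_a, glyph_b)
        for pixel_a, pixel_b in zip(row_a, row_b)
    )


def recognize_character(glyph):
    """Nearest neighbor: return the label of the closest template."""
    return min(TEMPLATES, key=lambda label: hamming_distance(glyph, TEMPLATES[label]))


def recognize_plate(glyphs):
    """Classify each segmented character image and join the labels."""
    return "".join(recognize_character(glyph) for glyph in glyphs)
```

A practical system would need many labeled examples per character, larger images, and features more robust than raw pixels, but the classification step has the same shape: compare against known examples and pick the closest.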

It is worth mentioning that such license plate recognition systems are already widely used by police forces. The video below shows an example. If you live in California, then you probably have heard stories where people get their traffic ticket in the mail together with a picture of their license plate. A friend of mine told me that once he actually received a ticket in the mail together with a link. Following the link, he was able to view a video of himself making a right turn without making a complete stop. How amazing!




A BYU parking officer said the following in an interview:
"With the money we saved in parking sticker costs, we were able to buy the car."

What I probably would add to that is: "With the extra parking tickets we were able to write, I am expecting a much bigger bonus!" Just kidding!

Picture of the Day:

Google Street View uses a facial recognition software agent to detect faces in photos and then blur them for privacy protection. The software agent dutifully blurred hunger striker Bobby Sands's face in a street portrait in Belfast. (Click the picture to see more!)

Thursday, February 05, 2009

My Research: BYU UAV Demo for Utah County Search and Rescue Team

On November 21, 2009, our research group, WiSAR (Wilderness Search and Rescue) demonstrated our UAV technologies to the Utah County Search and Rescue team representatives at Elberta, Utah. Three search and rescue personnel participated in the demo and one of them flew the UAV in a simulated search and rescue exercise.

In two previous blog postings I described BYU research on using UAV to support Wilderness Search and Rescue and UAV capabilities:

My Research: BYU UAV Demo Dry Run
Robot of the Day: UAVs at BYU

The demo was scheduled for 8:30-11:00 am at Elberta, Utah (in the middle of nowhere), which was about an hour's drive from BYU campus. That meant we had to get there by 8 to set up and test equipment. The previous day's weather forecast predicted snow showers, so I was assigned the task of picking up some hot chocolate from the BYU cafeteria so people wouldn't freeze to death!

Despite having to deal with my 10-month-old son's high fever at 1:30am, not really falling asleep until 3:30am, and unconsciously turning off my alarm clock, I actually made it to the cafeteria only 5 minutes late. Then I waited another 25 minutes because they hadn't made the hot chocolate yet. By the time I arrived at the demo site at 8:30am, it turned out the trailer had just gotten there too, so I didn't miss anything! It also turned out the weather forecast was way off: there was no snow at all, and it was going to be a great day!


Left to right, top to bottom: 1. BYU Cafeteria 2. Beautiful Utah mountains at Dawn
3. The lonely freeway 4. Driving down the highway 5. Good morning, Cows!
6. Gravel road with the destination in view (the ridge in the far distance).


The pictures above were taken by me using an android phone running NASA's GeoCam mobile client. Therefore, all photos were geo-tagged with GPS locations and camera orientation. You can actually view them from Google Earth, where you'll see the exact route I took on the map. Just download the zip file, unzip, and then double click the kml file.

Viewing pictures from Google Earth

The goal of the demo was to show real search and rescue workers how easy and useful our UAV technologies are in support of search and rescue operations. A simulated search and rescue mission was set up: a member of the search and rescue team had to fly the UAV using our interface and locate the simulated missing person (a dummy placed in the wilderness). Students and professors from BYU also acted as aerial video analysts and ground searchers to assist the simulated search. The picture below shows a ground searcher scouting around in the distance searching for the missing person. The ground searchers always wear bright-colored vests so they can be easily spotted by others (e.g. from the aerial videos) and don't get shot at by hunters. (I know, research is a dangerous profession!)


Ground searcher in the distance (click photo to enlarge)


After setting up everything, Ron Zeeman, a member of the Utah County Search and Rescue team, test flew the UAV, and completed a test drill (launch, manual control, fixed pattern flying, and landing).

Left to right: 1. People busy setting things up 2. UAV at dawn 3. Last minute exercise

After other Search and Rescue team members arrived, we explained how our UAV works, and then started the simulated search and rescue mission. This time I was quite lucky to catch the flying UAV with my camera.

Left to right: 1. Two more professional searchers arrived 2. Two retired UAVs in display 3. The show is on now!

Left to right: 1 and 2. UAV in the air 3. UAV loitering above area of interest (click to enlarge)

Left to right: 1. The kind of junk people would dump in the middle of nowhere 2. Debris of a camp fire 3. Real-time video mosaicing (frame stitching)

Eventually, the missing person was located in the aerial video and confirmed by ground searchers. "Unfortunately", by the time we found "him", he was not breathing.


The "missing person" was found, breathless.

Technologies demonstrated include auto launch, auto land, various UAV control modes (carrot and stick, fixed pattern flying, etc.), integrated gimballed camera view in augmented virtuality, click and point gimballed camera control (separate from UAV path), real-time video mosaicing, real-time video annotation and video zooming/scrubbing, and point of interest communication between video GUI and UAV control GUI.

Technologies not demonstrated but still in progress include automatic missing person probability distribution generation, automatic path planning (based on distribution), see-ability metric to measure coverage quality, and automatic anomaly detection.

The demo was a great success! The professional searchers were pleasantly surprised by the ease of operating the UAV and the usefulness of the aerial video support. Their comments included, "That was so cool!" "This could be very helpful!"




Video of the Day:

If the UAV sitting in our lab had a mind of its own,
it would have been singing this all night long...

Wednesday, February 04, 2009

Paper Review: Distributional Clustering of Words for Text Classification

This paper was written by Baker from Carnegie Mellon and McCallum from Justsystem Pittsburgh Research Center, published at the 21st annual international ACM SIGIR conference on Research and development in information retrieval, 1998.

This paper applies Distributional Clustering to document classification. Distributional Clustering can reduce the feature space by joining words that induce similar probability distributions among the target features that co-occur with the words in question. Word similarity is measured by the distributions of class labels associated with the words in question.

The three benefits of using word clustering are: useful semantic word clusterings, higher classification accuracy, and smaller classification models.

Clustered Word Clouds by Jeff Clark, perfect in memory of Dr. King on this year's Martin Luther King Day!

The paper first goes over the probabilistic framework and naïve Bayes, and then suggests using the weighted average of the individual distributions as the new distribution. Kullback-Leibler divergence, an information-theoretic measure of the difference between two probability distributions, is introduced, and then the paper uses “KL divergence to the mean”. Instead of compressing two distributions optimally with their own codes, the paper uses the code that would be optimal for their mean. Assuming a uniform class prior, choosing the most probable class by naïve Bayes is identical to choosing the class that has the minimal cross entropy with the test document. Therefore, when words are clustered according to this similarity metric, the increase in naïve Bayes error is minimized.

The algorithm works as follows: The clusters are initialized with the M words that have the highest mutual information with the class variable. The two most similar clusters are joined, and then the next word is added as a singleton cluster to bring the total number of clusters back up to M. This repeats until all words have been put into one of the M clusters.
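
The loop described above can be sketched in a few lines of Python. Again, this is my own rough sketch under stated assumptions rather than the authors' implementation: the words are assumed to arrive already sorted by mutual information with the class variable, each one comes as a hypothetical (word, prior, class distribution) tuple, similarity is the weighted KL divergence to the mean, and M is assumed to be at least 2. The toy words and numbers are invented for illustration.

```python
import math
from itertools import combinations

def kl(p, q):
    """KL divergence D(p || q) in bits; terms where p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def merge(a, b):
    """Join two clusters; the new class distribution is their weighted average."""
    words_a, wa, pa = a
    words_b, wb, pb = b
    total = wa + wb
    mean = [(wa * x + wb * y) / total for x, y in zip(pa, pb)]
    return (words_a + words_b, total, mean)

def distance(a, b):
    """Weighted KL divergence of both clusters to their merged distribution."""
    _, wa, pa = a
    _, wb, pb = b
    _, total, mean = merge(a, b)
    return (wa / total) * kl(pa, mean) + (wb / total) * kl(pb, mean)

def cluster_words(ranked_words, m):
    """Greedy clustering into m clusters (assumes m >= 2).

    ranked_words: list of (word, prior, class_dist) tuples, assumed to be
    sorted by descending mutual information with the class variable.
    """
    # Start with the m top-ranked words as singleton clusters.
    clusters = [([w], prior, list(dist)) for w, prior, dist in ranked_words[:m]]
    for w, prior, dist in ranked_words[m:]:
        # Join the two most similar clusters, which leaves m - 1 clusters...
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: distance(clusters[ij[0]], clusters[ij[1]]))
        merged = merge(clusters[i], clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        # ...then add the next word as a singleton to get back to m.
        clusters.append(([w], prior, list(dist)))
    return clusters

# Toy input: two classes (say, sports and politics), five made-up words.
ranked = [
    ("goal",   0.2, [0.90, 0.10]),
    ("vote",   0.2, [0.10, 0.90]),
    ("score",  0.2, [0.85, 0.15]),
    ("senate", 0.2, [0.15, 0.85]),
    ("team",   0.2, [0.80, 0.20]),
]
result = cluster_words(ranked, 3)
groups = sorted(sorted(words) for words, _, _ in result)
```

The brute-force search over all cluster pairs is fine here because M stays small; it is the vocabulary that is large, and each word only gets visited once.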

The paper uses three real-world text corpora: newswire stories, UseNet articles, and Web pages. Results show that Distributional Clustering can reduce the feature dimensionality by three orders of magnitude while losing only 2% accuracy.

Distributional Clustering performs better than feature selection because merging preserves information instead of discarding it. Some features that are infrequent, but useful when they do occur, get removed by the feature selector; feature merging keeps them.




I have a dream that one day when I get old, there will be intelligent robots to take care of people like me, so we can enjoy life freely, happily, and independently.






Tuesday, February 03, 2009

Joy of Life: Volume 1 Chapter 4

Volume One: The City by the Sea
-- written by Maoni

Chapter 4: Late Night Visitor

“Are you thinking about something?”
The little girl sitting by Fan Xian’s right hand side asked with pouted lips while the two servant girls were busy setting up the dinner table. The little girl was a bit skinny and had somewhat darkish skin. Sitting right next to Fan Xian, whose face was almost as pretty as a girl’s, the little girl appeared even more pitiful.
Fan Xian reached out and rubbed the yellowish hair on the little girl’s head.
“I am thinking about what kind of tasty food you get to eat every day in the Capital City,” he grinned.
The little girl was the daughter of the Count of Southernland, Fan Xian’s younger sister. They shared the same biological father but had different mothers. Her name was Ruo-Ruo.
She had always been feeble since birth. The Old Madam loved her granddaughter very much, so she sent for her a year ago and kept her in Danzhou for recuperation. A year had already gone by and there still wasn’t much of an improvement. The hair on her head still looked sparse. Born into the family of a government official, she never lacked warm clothing or good food. Her symptoms must have resulted from premature labor, not malnutrition.
Fan Xian found himself quite fond of the little girl. Even though he dealt with her using the attitude of an uncle, treating her as an adorable little kid, playing with her, telling her stories, in everyone else’s eyes, such an attitude became the clear proof of a loving brother-sister relationship.
Nevertheless, due to Fan Xian’s awkward status – a baseborn son, which is very different from a legitimate daughter – the servant girls intentionally avoided mentioning anything regarding the other Count’s Manor in the Capital City.
Since the brother asked, the little girl sincerely started counting with her fingers the tasty things she had enjoyed back in the Capital City. The memory of a three-year-old is surely quite limited, so all she did was to repeat candied haws[1] and dough figurines[2].
It was quite late when dinner was over. The setting sun was already half hidden by the other side of the continent, and the dense twilight began to envelop the entire manor.
“Alas! Ruo-Ruo, you are indeed a Weak-Weak[3]!”
“Brother is teasing again!”
“Alright, alright! What story would you like to hear today?”
“Snow White.”
A big grin suddenly appeared on Fan Xian’s face out of nowhere. Luckily there was no one else around; otherwise one would be shocked to spot such a queer grin, one that only grown-ups are capable of, on the face of a four-year-old boy.
“How about a ghost story?”
“No! I don’t want it.” Fan Ruo-Ruo was quite frightened, shaking her head vigorously. Tears quickly formed around her eyes and two streams of tears soon rolled down her darkish face. Evidently, she had been tormented by ghost stories many times in the past year.
Teasing this little girl was only one of Fan Xian’s many vulgar hobbies. What he was best at was bantering with those servant girls. He frequently told ghost stories to those youthful blossoming girls, who always ended up jostling tightly into each other’s embraces screaming and shivering on top of the bed.
Although it was out of the question for Fan Xian to flirt with the young girls vocally for the sake of concealing his true self, he always enjoyed the sweet and tender hugs at these times.
He would always reassure himself with the argument that as a young kid, he was still in a phase where touching was very desirable. So what he did was completely normal and warranted, not something shameful.
Every time the servant girls became curious as to how he could have known so many terrifying stories, Fan Xian would always blame his teacher. The direct result was that all the servant girls now eyed the teacher with a disgruntled look. “The Count is paying a handsome salary for you to give lessons to the Young Master, yet you teach him ghost stories. It is already evil to scare a little kid, but it is even more evil to scare us, the blossoming flowers.”
After the usual evening ghost story telling was over, the two servant girls, still carrying a mixture of frightened yet satisfied looks, attended to the little guy’s evening hygiene routines and then shut the door to let him sleep.
It seemed to be just another ordinary night.
Fan Xian pushed the hard and uncomfortable porcelain headrest[4] to the side, and then took out a winter robe from the chest of drawers. Folding it into a nice rectangular shape, he made himself a “pillow”.
He rested his head against the “pillow”, but his two eyes remained open. They shone dimly in the dark night as he remained awake for a long time.
He had accepted the fact that he was reborn into this world, but he was still not used to the customs. It should have been around nine o’clock in the evening, and it was not a comfortable feeling to sleep so early. Besides, he had already slept too much on the sickbed in his previous life.
He stroked the surface of the bed board with his hand and felt better with the conclusion that no one could easily spot the secret casing he had made. As he became more relaxed, naturally, the inner energy inside him began to circulate gradually and he slowly approached the state of meditative trance.
“What kind of life should I live in this world? And how should I spend the next few decades? Maybe I’ll even have many wives and concubines like an ordinary nobleman,” Fan Xian’s mind wandered.
Just on the edge of entering the emptiness of mind, Fan Xian was suddenly awoken by an unannounced visitor.
……
……
“Are you Fan Xian?”
A man appeared by his bed all of a sudden. With a grain of abnormal brown in the pupils, his eyes only showed the coldest apathy, clearly indicating his indifference to life.
The question was actually asked nicely. But if such a question was asked by someone who had sneaked into your bedroom at midnight, wearing a mask on his face, holding a dagger in his hand, with a few small bags pinned to his waist, the question could be very terrifying.
If Fan Xian had been a real four-year-old boy, he would surely have screamed at the top of his lungs at the first sight of this queer uncle.
Even if he did all his thinking with his toes, Fan Xian could still comprehend that a night traveler capable of sneaking into the Count’s Manor without triggering any alarms had to be someone with high-caliber Kung Fu and, very possibly, a cruel mind. If he had made any attempt to scream, most likely the man would not hesitate to break his neck in a split second.
At that thought, Fan Xian couldn’t help but feel quite pleased with his calmness. Working hard to suppress the growing uneasiness, he cleared his throat gently. Then putting on the most lovely baby face he could manage, he threw himself forward!
……
……
“Daddy, you are finally home!”
The four-year-old boy dove into the assassin’s arms, tears streaming, and held onto his waist tightly. But the boy’s arms were too short and could not completely circle around the waist. So instead, he grabbed the man’s robe tightly as though he was afraid the man would just suddenly disappear.
Perhaps the boy used too much strength when grabbing the robe. With a tearing sound, a strip of the robe came off.
The man frowned. Without any obvious movement, he suddenly freed himself from Fan Xian’s hug, and then just stood there, dumbstruck, as if he was still trying to work out why this baseborn son of the Count of Southernland had called him Daddy.
Meanwhile, he was also very puzzled. His robe was the top of the line gear of the Bureau. Even a knife wouldn’t have easily cut through it. How was the young kid able to easily tear it apart?
Fan Xian was even more baffled, so much so that he felt as though his heart began to bleed – during times when he managed to be alone, he had always experimented with the power of the nameless inner energy inside him using rocks in the rockwork hill in the courtyard. When he found out that he could almost manage to crush those not-too-hard rocks into bits and pieces with his tender little fingers, he started building some confidence in his self-defense ability.
It wasn’t an easy task to lower the man’s guard with the tears of a four-year-old. Directing all the inner energy to his fingers, Fan Xian thought he had a pretty good chance to subdue his opponent. But who could have imagined that all he was able to achieve was to tear a few strips off the man’s robe?
Something big was bound to happen now.


[1] http://www.lannyland.com/jol/TangHuLu.jpg
[2] http://www.lannyland.com/jol/MianRenEr.jpg
[3] The Chinese character for “weak” has the same pronunciation as the character in the girl’s name.
[4] Typical in ancient China before the pillow was invented.


Now support the author Maoni by clicking this link, and support the translator Lanny by following my blog! :)


Video of the Day: Matrix running on Windows XP

Click on the picture to view video on YouTube because video embedding was disabled.