Yahoo! Mindset, automatic content classification

links — THE HYPERGURU @ 10:05 am

Why is machine learning useful for the Web?

Machine learning is especially useful for applying human-like judgment to sets of data so large that it would be infeasible for humans to do the work. When the Web took off about ten years ago, machine learning acquired a cherished prize: a huge and ever-growing corpus of data. With billions of pages and counting, the Web is too big for humans to encompass entirely. This is where machine learning comes in.

Because the machine learning technology can only be as “smart” as the humans who generated the seed set, improving that set will improve the accuracy of the demo.

Yahoo! Mindset is a Yahoo! Research Labs demo that puts a new twist on search. It uses machine learning technology to give you a choice: view Yahoo! Search results sorted according to whether they are more commercial or more informational (i.e., from academic, non-commercial, or research-oriented sources).

You control the slider to decide how you want the results sorted. The midpoint of the slider represents the default setting. In this position, the order of results matches Yahoo! Search web results. As you move the slider right, toward “researching,” or left, toward “shopping,” the results are automatically re-sorted for you.
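To make the slider idea concrete, here is a minimal sketch of one way such re-sorting could work. It is not how Mindset is actually implemented; the demo's scoring formula is not published here. The `Result` class, the `info_score` field (from -1.0 for purely commercial to +1.0 for purely informational), and the blending formula are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Result:
    url: str
    rank: int          # position in the default search ordering (0 = top)
    info_score: float  # hypothetical classifier score: -1.0 commercial, +1.0 informational


def resort(results, slider):
    """Re-sort results for a slider value in [-1.0, 1.0].

    -1.0 means fully "shopping", 0.0 is the default order,
    +1.0 means fully "researching". The blend below is an assumed
    formula, not the one used by the real demo.
    """
    if not -1.0 <= slider <= 1.0:
        raise ValueError("slider must be between -1.0 and 1.0")
    n = len(results)

    def key(r):
        # Base relevance: 1.0 for the top default result, decreasing with rank.
        relevance = 1.0 - r.rank / n
        # Shift the score toward informational (slider > 0) or commercial (slider < 0).
        return relevance + slider * r.info_score

    return sorted(results, key=key, reverse=True)


results = [
    Result("shop.example", 0, -0.9),
    Result("uni.example", 1, 0.8),
    Result("blog.example", 2, 0.1),
]
default_order = [r.url for r in resort(results, 0.0)]
research_order = [r.url for r in resort(results, 1.0)]
shopping_order = [r.url for r in resort(results, -1.0)]
```

With the slider at 0.0 the class score drops out of the formula, so the order matches the default ranking, which mirrors the behavior described for the slider's midpoint.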

Commercial implies that the primary purpose of a given page is to sell you something. Informational implies that the primary purpose of the page is to provide information related to your search.

This Mindset demo is an example of machine learning applied to the problem of text classification. Machine learning and text classification are two different fields of technical research that found common cause about ten years ago with the emergence of the Web.

Remember, this demo is a work in progress, put together by scientists to test new ideas and techniques. To start the scoring process, a small team of humans scored pages manually to develop the “seed set” of pages on which machine learning would be based. For the seed set, we didn’t rigorously require everyone to use the same scoring approach, so the scoring results may need some fine-tuning.
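As a rough illustration of how a hand-scored seed set can drive text classification, the sketch below trains a tiny naive Bayes classifier on labeled snippets. Naive Bayes is a stand-in chosen for brevity; the post does not say which learning algorithm Mindset uses. The seed texts, the labels, and the function names `train` and `classify` are all made up for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Very crude tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def train(seed_set):
    """Build word counts per label from (text, label) pairs scored by humans."""
    word_counts = {}
    label_counts = Counter()
    vocab = set()
    for text, label in seed_set:
        label_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in tokenize(text):
            counts[word] += 1
            vocab.add(word)
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the label with the highest naive Bayes log-probability."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in the seed set
                # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
                score += math.log(
                    (word_counts[label][word] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical seed set: a few snippets labeled by hand.
seed_set = [
    ("buy cheap cameras free shipping", "commercial"),
    ("order now discount price", "commercial"),
    ("history of the camera research paper", "informational"),
    ("university study of optics", "informational"),
]
model = train(seed_set)
label_a = classify("discount cameras buy now", model)
label_b = classify("research study of optics", model)
```

Because the classifier only knows what the seed set tells it, inconsistent or sloppy hand labels feed directly into its predictions, which is why fine-tuning the seed set matters.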

Yahoo! Mindset
