Basics of Web Data Mining and Challenges in Web Data Mining Process

Today the World Wide Web is flooded with billions of static and dynamic web pages created with languages and technologies such as HTML, PHP and ASP. The Web is a great source of information and offers a lush playground for data mining. Since the data stored on the web comes in various formats and is dynamic in nature, it is a significant challenge to search, process and present the unstructured information available on the web.

The complexity of a web page far exceeds that of any conventional text document. Web pages on the internet lack uniformity and standardization, while traditional books and text documents are much more consistent. Further, search engines, with their limited capacity, cannot index all web pages, which makes data mining extremely inefficient.

Moreover, the Internet is a highly dynamic knowledge resource and grows at a rapid pace. Sports, news, finance and corporate sites update their pages on an hourly or daily basis. Today the Web reaches millions of users with different profiles, interests and usage purposes. Every one of them needs good information but does not know how to retrieve relevant data efficiently and with the least effort.

It is important to note that only a small portion of the web contains really useful information. There are three usual methods that a user adopts when accessing information stored on the internet:
• Random surfing, i.e. following the large numbers of hyperlinks available on a web page.
• Query-based search on search engines, i.e. using Google or Yahoo to find relevant documents by entering specific keyword queries of interest in the search box.
• Deep query searches, i.e. fetching results from searchable databases such as eBay.com's product search engine or Business.com's service directory.

To use the web as an effective resource for knowledge discovery, researchers have developed efficient data mining techniques to extract relevant data easily, smoothly and cost-effectively.
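
The point that a web page is harder to process than a plain text document, and the idea of surfing by following hyperlinks, can be illustrated with a small sketch. It uses only Python's standard library and works on an inline sample page rather than a live site. The class name LinkAndTextExtractor, the sample HTML and the example.com addresses are hypothetical and chosen only for illustration; this is one possible approach, not a technique described in the text above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkAndTextExtractor(HTMLParser):
    """Collects hyperlinks and visible text from a single HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []        # absolute URLs found in <a href="...">
        self.text_parts = []   # visible text fragments, in page order
        self._skip_depth = 0   # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page address.
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside <script> and <style>.
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


# Hypothetical sample page standing in for a downloaded document.
SAMPLE_PAGE = """
<html><head><title>Scores</title><style>p {color: red}</style></head>
<body><p>Latest <b>sports</b> results</p>
<a href="/finance">Finance</a> <a href="http://example.com/news">News</a>
</body></html>
"""

parser = LinkAndTextExtractor("http://example.com/")
parser.feed(SAMPLE_PAGE)
links = parser.links
text = " ".join(parser.text_parts)
print(links)
print(text)
```

Even for this tiny page, the markup, styling and navigation have to be separated from the readable content before any mining can start, and each extracted link is a candidate page to visit next.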