Continuing from the previous post: Lewandowski (2012), in his research
on web search engine credibility, discussed the problem of delivering
highly credible search results to search engine users. He also described
the criteria by which search engines decide whether to include documents
in their indices. These criteria include:
· Text-based matching: matching queries against documents to find the documents that satisfy the query.
· Popularity: measured by clicks, links that lead to the page, etc.
· Freshness: how new and up to date the document is.
· Locality: knowing where the user is located is paramount in giving useful results.
· Personalisation: giving results based on the user's search habits.
He argued that popularity lies at the heart of these systems.
Search engines use many ranking algorithms to order the web pages they index.
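To make the idea of combining these criteria concrete, here is a minimal sketch in Python. It is not taken from Lewandowski (2012) and it is not how any real search engine works. The weights, the `Signals` class and the function names are all hypothetical, and each signal is assumed to be already scaled to a value between 0 and 1. Popularity is simply given the largest weight to echo the claim above.

```python
from dataclasses import dataclass

# Hypothetical weights, chosen only for illustration.
# Real search engines do not publish how they weigh their criteria.
WEIGHTS = {
    "text_match": 0.30,
    "popularity": 0.40,  # largest weight, echoing the claim that popularity is central
    "freshness": 0.10,
    "locality": 0.10,
    "personalisation": 0.10,
}


@dataclass
class Signals:
    """One document's signals, each assumed to be scaled to the range 0..1."""
    text_match: float
    popularity: float
    freshness: float
    locality: float
    personalisation: float


def combined_score(signals: Signals) -> float:
    """Weighted sum of the five criteria."""
    return sum(weight * getattr(signals, name) for name, weight in WEIGHTS.items())


def rank(docs: dict[str, Signals]) -> list[str]:
    """Return document names ordered from highest to lowest combined score."""
    return sorted(docs, key=lambda name: combined_score(docs[name]), reverse=True)
```

With these made-up weights, a page that matches the query text slightly less well can still outrank a better textual match if it is much more popular.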
Chandra, Suaib & Beg (2015) outlined and briefly discussed Google's
search algorithm updates against web spam, including PageRank, Panda
and Penguin, among many others. According to them, PageRank counts the
number and quality of links to a page to calculate a rough estimate of
a website's global importance. The underlying assumption is that
important websites are more likely to receive a high number of links
from other websites. They added that Google's search engine was
initially based on PageRank together with signals such as the page
title, anchor text and links. Chandra, Suaib & Beg (2015) further stated
that Google's search engine currently uses more than 200 signals to rank
web pages and to combat web spam. Google also uses its huge volume of
usage data (query logs, browser logs, ad-click logs, etc.) to interpret
the complex intent behind cryptic queries and to provide relevant
results to the end user.
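The link-counting idea behind PageRank can be sketched in a few lines of Python. This is a simplified textbook version (repeated redistribution of rank along links), not Google's production algorithm and not code from Chandra, Suaib & Beg (2015). The damping factor of 0.85 comes from the original PageRank paper rather than from the article reviewed here, and the function name, the iteration count and the tiny link graph in the usage note are assumptions for illustration.

```python
def pagerank(links: dict[str, list[str]], damping: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    """Estimate page importance from a graph of page -> pages it links to."""
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with rank spread evenly over all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split equally among its out-links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no out-links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank
```

For example, with `links = {"a": ["c"], "b": ["c"], "c": ["a"]}`, page "c" ends up with the highest score because two pages link to it, while "b", which nobody links to, ends up with the lowest.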
In their research, they explained that the Panda update aimed to lower
the rank of low-quality websites, and that it increased the ranking of
news and social networking sites. Panda is a filter that down-ranks
sites with thin content, content farms, doorway pages, affiliate
websites, sites with a high ads-to-content ratio and a number of other
quality issues. The Panda update affects the ranking of an entire
website rather than an individual page. It includes new signals such as
data about the sites users blocked, either directly on the search engine
results page or via the Chrome browser.
Another important algorithm update is the Penguin update, which is
purely a web spam algorithm update. It adjusts a number of spam factors,
including keyword stuffing, in-links coming from spam pages and anchor
text/link relevance. Penguin detects over-optimization of tags and
internal links, bad neighborhoods, bad ownership, etc.
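Keyword stuffing, one of the spam factors mentioned above, is easy to picture with a rough keyword-density check. The sketch below is only an illustration and does not reflect how Penguin actually works, since Google has not published that. The function names and the 20% threshold are hypothetical, and the check handles single-word keywords only.

```python
import re


def keyword_density(text: str, keyword: str) -> float:
    """Fraction of the words in `text` that are exactly `keyword` (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)


def looks_stuffed(text: str, keyword: str, threshold: float = 0.2) -> bool:
    """Flag text whose keyword density exceeds a hypothetical threshold."""
    return keyword_density(text, keyword) > threshold
```

A phrase like "cheap shoes cheap shoes buy cheap shoes" would be flagged for the keyword "cheap", while an ordinary sentence such as "We sell shoes in many sizes" would not.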
A Review Of Journal Article On Search Engine Operations (Part 2)