0% found this document useful (0 votes)

19 views56 pages

SEO: The PAGE RANK Algorithm: Presidency University, Bengaluru School of Engineering

The document discusses PageRank, the algorithm used by Google to rank web pages in search results. It explains how PageRank analyzes the entire structure of hyperlinks across the web to measure the importance of a page. Key aspects covered include how PageRank considers both the number and quality of links to a page, as well as the importance of pages linking to it, and how it can be modeled using a Markov chain.

Uploaded by

Harsha Vardhan Gorantla

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PPTX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

19 views56 pages

SEO: The PAGE RANK Algorithm: Presidency University, Bengaluru School of Engineering

Uploaded by

Harsha Vardhan Gorantla

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PPTX, PDF, TXT or read online on Scribd

PRESIDENCY UNIVERSITY, BENGALURU

SCHOOL OF ENGINEERING

SEO: The
PAGE RANK algorithm
Tapas Guha, Assistant Professor, CSE
Presidency University, Bangalore
Contents
• Introduction
• Search Engine Optimization( SEO)
• Citation Analysis: The beginning of Ranking
• The Page Rank Algorithm
• The Page Rank Calculation
• Markov Chains
• Dangling Links
• Some PageRank Algorithm Variations
• Parallel Page Rank Algorithms
• HITS
• Limitations of HITS
• Conclusion

References
Introduction

• PageRank is a numeric value that represents how important a page is on the web. Google
figures that when one page links to another page, it is effectively casting a vote for the
other page. The more votes that are cast for a page, the more important the page must be.
Also, the importance of the page that is casting the vote determines how important the vote
itself is. Google calculates a page’s importance from the votes cast for it. How important
each vote is taken into account when a page’s PageRank is calculated.
• PageRank is Google’s way of deciding a page’s importance. It matters because it is one of
the factors that determines a page’s ranking in the search results. It isn’t the only factor that
Google uses to rank pages, but it is an important one.
The web as a directed graph

4
The web as a directed graph

5
The web as a directed graph

 Assumption 1: A hyperlink is a quality signal.

6
The web as a directed graph

 Assumption 1: A hyperlink is a quality signal.

 The hyperlink d1 → d2 indicates that d1‘s author deems d2
high-quality and relevant.

7
The web as a directed graph

 Assumption 1: A hyperlink is a quality signal.

 The hyperlink d1 → d2 indicates that d1‘s author deems d2
high-quality and relevant.
 Assumption 2: The anchor text describes the content of d2.

8
The web as a directed graph

 Assumption 1: A hyperlink is a quality signal.

9
The web as a directed graph

 Assumption 1: A hyperlink is a quality signal.

 The hyperlink d1 → d2 indicates that d1‘s author deems d2
high-quality and relevant.
 Assumption 2: The anchor text describes the content of d2.
 We use anchor text somewhat loosely here for: the text
surrounding the hyperlink .
 Example: “You can find cheap cars ˂a href =[Link] ˂/a ˃. ”

10
The web as a directed graph

 Assumption 1: A hyperlink is a quality signal.

12
[text of d2] only vs. [text of d2] + [anchor text → d2]

 Searching on [text of d2] + [anchor text → d2] is often

more effective than searching on [text of d2] only.

13
[text of d2] only vs. [text of d2] + [anchor text → d2]

 Searching on [text of d2] + [anchor text → d2] is often

more effective than searching on [text of d2] only.
 Example: Query IBM

14
[text of d2] only vs. [text of d2] + [anchor text → d2]

 Searching on [text of d2] + [anchor text → d2] is often

more effective than searching on [text of d2] only.
 Example: Query IBM
 Matches IBM’s copyright page

15
[text of d2] only vs. [text of d2] + [anchor text → d2]

 Searching on [text of d2] + [anchor text → d2] is often

more effective than searching on [text of d2] only.
 Example: Query IBM
 Matches IBM’s copyright page
 Matches many spam pages

16
[text of d2] only vs. [text of d2] + [anchor text → d2]

 Searching on [text of d2] + [anchor text → d2] is often

more effective than searching on [text of d2] only.
 Example: Query IBM
 Matches IBM’s copyright page
 Matches many spam pages
 Matches IBM wikipedia article

17
[text of d2] only vs. [text of d2] + [anchor text → d2]

 Searching on [text of d2] + [anchor text → d2] is often

18
[text of d2] only vs. [text of d2] + [anchor text → d2]

 Searching on [text of d2] + [anchor text → d2] is often

19
[text of d2] only vs. [text of d2] + [anchor text → d2]

 Searching on [text of d2] + [anchor text → d2] is often

more effective than searching on [text of d2] only.
 Example: Query IBM
 Matches IBM’s copyright page
 Matches many spam pages
 Matches IBM wikipedia article
 May not match IBM home page!
 … if IBM home page is mostly graphics
 Searching on [anchor text → d2] is better for the query IBM.

20
[text of d2] only vs. [text of d2] + [anchor text → d2]

 Searching on [text of d2] + [anchor text → d2] is often

22
Anchor text containing IBM pointing to [Link]

23
Indexing anchor text

24
Indexing anchor text

 Thus: Anchor text is often a better description of a page’s

content than the page itself.

25
Indexing anchor text

 Thus: Anchor text is often a better description of a page’s

content than the page itself.
 Anchor text can be weighted more highly than document
text.
(based on Assumption 1&2)

26
Exercise: Assumptions underlying PageRank

27
Exercise: Assumptions underlying PageRank

 Assumption 1: A link on the web is a quality signal – the

author of the link thinks that the linked-to page is high-
quality.

28
Exercise: Assumptions underlying PageRank

 Assumption 1: A link on the web is a quality signal – the

author of the link thinks that the linked-to page is high-
quality.
 Assumption 2: The anchor text describes the content of
the linked-to page.

29
Exercise: Assumptions underlying PageRank

 Assumption 1: A link on the web is a quality signal – the

author of the link thinks that the linked-to page is high-
quality.
 Assumption 2: The anchor text describes the content of
the linked-to page.
 Is assumption 1 true in general?

30
Exercise: Assumptions underlying PageRank

 Assumption 1: A link on the web is a quality signal – the

author of the link thinks that the linked-to page is high-
quality.
 Assumption 2: The anchor text describes the content of
the linked-to page.
 Is assumption 1 true in general?
 Is assumption 2 true in general?

31
Google bombs

32
Google bombs
 A Google bomb is a search with “bad” results due to maliciously manipulated anchor
text.

33
Google bombs
 A Google bomb is a search with “bad” results due to maliciously manipulated anchor
text.
 Google introduced a new weighting function in January 2007
that fixed many Google bombs.

34
Google bombs
 A Google bomb is a search with “bad” results due to maliciously manipulated anchor
text.
 Google introduced a new weighting function in January 2007
that fixed many Google bombs.
 Still some remnants: [dangerous cult] on Google, Bing, Yahoo

35
Google bombs
 A Google bomb is a search with “bad” results due to maliciously manipulated anchor
text.
 Google introduced a new weighting function in January 2007
that fixed many Google bombs.
 Still some remnants: [dangerous cult] on Google, Bing, Yahoo
 Coordinated link creation by those who dislike the Church of
Scientology

36
Google bombs
 A Google bomb is a search with “bad” results due to maliciously manipulated anchor
text.
 Google introduced a new weighting function in January 2007
that fixed many Google bombs.
 Still some remnants: [dangerous cult] on Google, Bing, Yahoo
 Coordinated link creation by those who dislike the Church of
Scientology
 Defused Google bombs: [dumb motherf…], [who is a failure?], [evil empire]

37
How far do people look for results?

(Source: [Link] WhitePaper_2006_SearchEngineUserBehavior.pdf) 38

Search Engine Optimization(SEO)
• With the rapid growth of WWW most of the users use information retrieval tools like search engines to find
information from the web.

• There are tens and hundreds of search engines available but some are popular like Google, Yahoo, Bing etc.,
because of their crawling and ranking methodologies.

• The search engines download, index and store hundreds of millions of web pages. They answer tens of
millions of queries every day. So Web mining and ranking mechanism becomes very important for effective
information retrieval.

• Before presenting the pages to the user, a ranking mechanism is done by the search engines to present the most
relevant pages at the top and less relevant ones at the bottom.
Citation Analysis: The beginning of Ranking
• Citation frequency
• Bibliographic coupling frequency-
• Articles that co-cite the same articles are related .
• Citation indexing.
• But, the web is not a scholarly citation. Millions of participants, each with self interests.
• Spamming is widespread.
• Once search engines began to use links for ranking (roughly 1998), link spam grew.
• One can join a link farm – a group of websites that heavily link to one another.
The Page Rank Algorithm
• "PageRank can be thought of as a model of user behavior. We assume there is a "random surfer" who is given a
web page at random and keeps clicking on links, never hitting "back" but eventually gets bored and starts on
another random page. The probability that the random surfer visits a page is its PageRank.“
• Quoting from the original Google paper, PageRank is defined like this [1]:
We assume page A has pages T1…Tn which point to it (i.e., are citations). The parameter d is a damping factor
which can be set between 0 and 1. We usually set d to 0.85. There are more details about d in the next section. Also
C(A) is defined as the number of links going out of page A. The PageRank of a page A is given as follows:

PR(A) = (1-d) + d (PR(T1)/C(T1) + … + PR(Tn)/C(Tn))

PR(Tn) – Page Rank of page Tn, C(Tn) –The count, or number, of outgoing links for page n, PR(Tn)/C(Tn) –The
share of the vote page A will get from Tn,
d – All these fractions of votes are added together but, to stop the other pages having too much influence, this total
vote is damped down by multiplying it by 0.85 (the factor "d”)
(1 – d) – The (1 – d) bit at the beginning is a bit of probability math magic so the "sum of all web pages' PageRanks
will be one”.
• The PageRanks form a probability distribution over web pages, so the sum of PageRanks of all web pages will be
1.
Page Rank Calculation
• The PR of each page depends on the PR of the pages pointing to it. But we won't know what PR those pages have
until the pages pointing to them have their PR calculated and so on… And when you consider that page links can
form circles it seems impossible to do this calculation! But actually it's not that bad.
• we can just go ahead and calculate a page's PR without knowing the final value of the PR of the other pages. That
seems strange but, basically, each time we run the calculation we're getting a closer estimate of the final value. So
all we need to do is remember the each value we calculate and repeat the calculations lots of times until the
numbers stop changing much. This stage is said to be convergence state. Let us take the following example.

• Each page has one outgoing link (the outgoing count is 1, i.e. C(A) = 1 and C(B) = 1).
• we don't know what their PR should be to begin with, so let's take a guess at 1.0 and do some calculations:

• d = 0.85
PR(A) = (1 – d) + d(PR(B)/1) PR(B) = (1 – d) + d(PR(A)/1)
• PR(A) = 0.15 + 0.85 * 1 = 1 PR(B) = 0.15 + 0.85 * 1 = 1
Sec. 21.2.1

Markov chains
• A Markov chain consists of n states, plus an nn transition probability matrix P.
• At each step, we are in one of the states.

• For 1  i,j  n, the matrix entry Pij tells us the probability of j being the next state, given we
are currently in state i.

Pii>0
is OK.

i j
Pij
Dangling links
• PageRank “Dangling links are simply links that point to any page with no outgoing links. They affect the
model because it is not clear where their weight should be distributed, and there are a large number of
them. Often these dangling links are simply pages that we have not downloaded yet……….Because
dangling links do not affect the ranking of any other page directly, we simply remove them from the system
until all the PageRanks are calculated. After all the PageRanks are calculated they can be added back in
without affecting things significantly.” [1]

• A dangling link is a link to a page that has no links going from it, or a link to a page that Google hasn’t
indexed. In both cases Google removes the links shortly after the start of the calculations and reinstates
them shortly before the calculations are finished. In this way, their effect on the PageRank of other pages is
minimal.
Weighted Page Rank Algorithm
• Authors in [2] proposed a Weighted Page Rank (WPR) algorithm which is an addition of the Page Rank
algorithm which assigns larger rank values to more important (popular) pages instead of dividing the rank
value of a page evenly among its out link pages. Each out link page gets a value proportional to its
popularity (its number of in links and out links). The popularity from the number of in links and out links is
recorded as Win (v, u) and Wout (v, u), respectively.
• Win (v, u) is the weight of link (v, u) calculated based on the number of in links of page u and the number
of in links of all reference pages of page v. Wout (v, u) is the weight of link (v, u) calculated based on the
number of out links of page u and the number of out links of all reference pages of page v.

• Where Iu and Ip represent the number of in links of page u and page p, respectively. R (v) denotes the
reference page list of page v. Where Ou and Op represent the number of out links of page u and page p,
respectively. R (v) denotes the reference page list of page v.
• Considering the importance of pages, the original Page Rank formula is modified as:
Page ranking based on visit of link
• Another algorithm [3] considers user’s browsing behavior. This concept is very useful to display most
valuable pages on the top of the result list on the basis of user browsing behavior, which reduce the search
space to a large scale .
• In this algorithm we assign more rank value to the outgoing links which is most visited by users. In this
manner a page rank value is calculated based on visits of inbound links.
• The modified version based on VOL is given in equation:

• Here d is a dampening factor, u represents a web page, B (u) is the set of pages that point to u, PR (u) and PR
(v) are rank scores of page u and v respectively, Lu is the number of visits of link which is pointing page u
from v. TL (v) denotes total number of visits of all links present on v.
Weighted PageRank based on visit of links
• In [4], authors proposed an algorithm that gives more rank value is assigned to the outgoing links which is
most visited by users and received higher popularity from number of in link.
• Here the popularity of out links is not considered which is considered in the original algorithm. The
advanced approach in the new algorithm is to determine the user’s usage trends.
• The user’s browsing behavior can be calculated by number of hits (visits) of links.
• The modified version based on WPR (VOL) is given as:

• Here d is a dampening factor, u represents a web page, B (u) is the set of pages that point to u, WPRVOL
(u) and WPRVOL (v) are rank scores of page u and v respectively, Lu is the number of visits of link which
is pointing page u from v. TL (v) denotes total number of visits of all links present on v.
DistanceRank Algorithm
• DistanceRank algorithm [5] employs a novel recursive method based on reinforcement learning [24] which
considers distance between pages as punishment, called “DistanceRank” to compute ranks of web pages.

• The number of ‘average clicks’ between two pages is defined as distance.

• The main objective of this algorithm is to minimize distance or punishment so that a page with smaller
distance to have a higher rank.

• Most of the current ranking algorithms have the “rich-get-richer” problem i.e. the popular high rank web
pages become more and more popular and the young high quality pages are not picked by the ranking
algorithms.

• The DistanceRank algorithm algorithms is less sensitive to the “rich-get-richer” problem and finds
important pages faster than others.

• It uses the logic that normally related pages are linked to each other so the distance based solution can find
pages with high qualities more quickly.
Parallel PageRank Algorithms
PR on PC Cluster [8]
• Efficient computation of PageRank is required due to huge size of web graph. Recent researchers discuss
some algorithmic and architectural approaches to speed up the computation of PageRank.
• To compute PageRank vector on Parallel platform first this algorithm store hyperlink matrix into binary
link
• structure file 𝐵 that is partitioned into 𝛽 chunks. Here 𝛽 is the number of systems in a cluster setup.
• Let there are T records in binary structure file and file is described as follows:

• In this figure, file B have two attribute i.e. dest_id and src_id that contains incoming and outgoing pages’ id
e.g. page 1 is pointed by page 1029 and page 2 is pointed by page 101 and page 23.
• File 0, and file 1 contains number of outgoing link and incoming link of page of file B.
Parallel PageRank Algorithms(cont.)

• Parallel Adaptive PageRank algorithm: [9] proposed a parallel PageRank computation on PC Cluster.
Kamvar et al. [10] found that PageRank vector for about 2/3 of pages it converges very early than for other
pages. So based on this concept they proposed adaptive PageRank method that avoid the overhead of
recomputation of already converged web pages.

• Parallel PageRank computation by using Gauss-Seidel/Jacobi method: [11] accelerate the PageRank
computation by applying Gauss-Seidel or Jacobi distance measures to improve convergence of PageRank
algorithm.

• PPR – MT algorithm: [12] based on Multi-threaded model of MPI. In this algorithm MPI is used as a
communication model between processes on different systems, and inter-thread method is used for
communication between threads. They have partitioned the binary link structure file to distribute it equally
among cluster nodes. The file B and I are partitioned and distributed among the cluster nodes, while the file
O is not partitioned.
Parallel PR Algorithms: A Comparative View

The following table is a comparative view of the parallel PageRank algorithms discussed in this presentation.

Load Parallelism Type Memory Used

Balancing
ALGORITH Data Task Distribute Shared
MS: d
Parallel PR on No Yes No Yes No
PC Cluster [8]
Adaptive [9,10] No Yes No Yes No
Parallel PR MPI Yes Yes Yes(Thread Yes Yes
MT level)
[12]
Parallel PR on Yes Yes No Yes No
Gauss/Jacobi [11]
HITS

• Hyperlink Induced Topic Search(HITS) Algorithm [13] ranks the web page by processing in links and
out links of the web pages. In this algorithm a web page is named as authority if the web page is
pointed by many hyperlinks and a web page is names as HUB if the page points to various hyperlinks.
Authorities and hubs are illustrated in Figure
• Hubs and authorities are assigned respective scores. Scores are computed in a mutually reinforcing
way; an authority pointed to by several highly scored hubs should be a strong authority while a hub that
points to several highly scored authorities should be a popular hub.
• Let ap and hp represent the authority and hub scores of page p, respectively. B (p) and I (p) denote the
set of referrer and reference pages of page p, respectively. The scores of hubs and authorities are
calculated as follows;
HITS

The page’s authority weight is proportional to the sum of the hub weights of pages that it links
to it, Kleinberg. Similarly, a page’s hub weight is proportional to the sum of the authority
weights of pages that it links to. The following figure shows an example of the calculation of
authority and hub scores.

HITS is a purely link-based algorithm. It is used to rank pages that are retrieved from the web,
based on their textual contents to a given query. Once these pages have been assembled, the
HITS algorithm ignores textual content and focuses itself on the structure of the web only.
Constraints of HITS

The following are the constraints of HITS algorithm :

• Hubs and authorities: It is not easy to distinguish between hubs and authorities because many sites
are hubs as well as authorities.
• Topic drift: Sometime HITS may not produce the most relevant documents to the user queries
because of equivalent weights.
• Automatically generated links: HITS gives equal importance for automatically generated links which
may not produce relevant topics for the user query.
• Efficiency: HITS algorithm is not efficient in real time.

HITS was used in a prototype search engine called Clever for an IBM research project. Because of the
above constraints HITS could not be implemented in a real time search engine.
References
[1] L. Page, S. Brin, R. Motwani, and T. Winograd, “The Pagerank Citation Ranking: Bringing order to the Web”. Technical Report, Stanford Digital
Libraries SIDL-WP-1999-0120, 1999.
[2] Wenpu Xing and Ali Ghorbani, “Weighted Page Rank Algorithm”, Proceedings of the Second Annual Conference on Communication Networks and
Services Research (CNSR ’04), IEEE, 2004.
[3] Gyanendra Kumar, Neelam Duhan, and Sharma A. K., “Page Ranking Based on Number of Visits of Web Pages”, International Conference on
Computer & Communication Technology (ICCCT)-2011, 978-1-4577-1385-9.
[4] Neelam Tyagi, Simple Sharma, “Weighted Page Rank Algorithm based on number of visits of links of web page”, International Journal of Soft
Computing and Engineering (IJCSE)-ISSN: 2231-2307, Volume-2, Issue-3, July 2012
[5] A. M. Zareh Bidoki and N. Yazdani, “DistanceRank: An intelligent ranking algorithm for web pages” Information Processing and Management, Vol 44,
No. 2, pp. 877-892, 2008.
[6] Taruna Kumari, Ashlesha Gupta, Ashutosh Dixit,”Comparative Study of Page Rank and Weighted Page Rank Algorithm”, International Journal of
Innovative Research in Computer and Communication Engineering-ISSN: 2320- 9798, Volume-2, Issue 2, February 2015.
[7] Chen, Y. Y., Gan, Q., & Suel, T, “I/O efficient techniques for computing PageRank”. In Proceedings of the eleventh international conference on
Information and knowledge management (pp. 549-557). ACM, 2002
[8] Rungsawang, A., & Manaskasemsak, B, “PageRank computation using PC cluster”. In European Parallel Virtual Machine/Message Passing Interface
Users’ Group Meeting (pp. 152-159). Springer Berlin Heidelberg, 2006
[9] Rungsawang, A., & Manaskasemsak, B,”Parallel adaptive technique for computing PageRank”. In Parallel, Distributed, and Network-Based
Processing, 2006. PDP 2006. 14th Euromicro International Conference on (pp. 6-pp). IEEE, 2009
[10] Kamvar, S., Haveliwala, T., & Golub, G, “Adaptive methods for the computation of PageRank. Linear Algebra and its Applications”, 386, 51-65,
2004
[11] Kohlschütter, C., Chirita, P. A., & Nejdl, W, “Efficient parallel computation of PageRank”. In European Conference on Information Retrieval (pp.
241-252). Springer Berlin Heidelberg, 2006
[12] Manaskasemsak, Bundit, Putchong Uthayopas, and Arnon Rungsawang. "A mixed MPI-thread approach for parallel page ranking computation." On
the
Move to Meaningful Internet Systems 2006: CoopIS, DOA, GADA, and ODBASE (2006): 1223-1233.
[13] J. Kleinberg, “Authoritative Sources in a Hyper-Linked Environment”, Journal of the ACM 46(5), pp. 604-632, 1999.
Thank You

Pagerank: Standing On The Shoulders of Giants
No ratings yet
Pagerank: Standing On The Shoulders of Giants
10 pages
PageRank's Historical Foundations
No ratings yet
PageRank's Historical Foundations
10 pages
Understanding Information Networks and the Web
No ratings yet
Understanding Information Networks and the Web
37 pages
9 Search Searching
No ratings yet
9 Search Searching
15 pages
Google PageRank Algorithm Explained
No ratings yet
Google PageRank Algorithm Explained
8 pages
Understanding PageRank and Link Analysis
No ratings yet
Understanding PageRank and Link Analysis
12 pages
Link Analysis: PageRank & HITS Overview
No ratings yet
Link Analysis: PageRank & HITS Overview
23 pages
Search Engine Optimization Fundamentals
No ratings yet
Search Engine Optimization Fundamentals
79 pages
Understanding Search Engine Functionality
No ratings yet
Understanding Search Engine Functionality
10 pages
Unit2 Link Analysis
No ratings yet
Unit2 Link Analysis
4 pages
L07
No ratings yet
L07
41 pages
Google Page Rank Algorithm Analysis
No ratings yet
Google Page Rank Algorithm Analysis
5 pages
Introduction to Search Engine Optimization
No ratings yet
Introduction to Search Engine Optimization
10 pages
Google Search Engine Architecture Study
No ratings yet
Google Search Engine Architecture Study
9 pages
Understanding PageRank and Its Impact
No ratings yet
Understanding PageRank and Its Impact
4 pages
How Google's PageRank Revolutionized Search
No ratings yet
How Google's PageRank Revolutionized Search
6 pages
Understanding the PageRank Algorithm
No ratings yet
Understanding the PageRank Algorithm
42 pages
Search Engine Functionality Overview
No ratings yet
Search Engine Functionality Overview
29 pages
SEO Fundamentals and Ranking Factors
No ratings yet
SEO Fundamentals and Ranking Factors
32 pages
Google PageRank Algorithm Explained
100% (1)
Google PageRank Algorithm Explained
10 pages
PageRank Algorithm in Web Search Engines
No ratings yet
PageRank Algorithm in Web Search Engines
26 pages
Relevance Propagation in Hypertext Models
No ratings yet
Relevance Propagation in Hypertext Models
8 pages
Understanding Link Analysis and PageRank
No ratings yet
Understanding Link Analysis and PageRank
104 pages
Understanding Search Engines and Algorithms
No ratings yet
Understanding Search Engines and Algorithms
13 pages
Math Behind the PageRank Algorithm
No ratings yet
Math Behind the PageRank Algorithm
8 pages
Understanding Web Mining Techniques
No ratings yet
Understanding Web Mining Techniques
22 pages
Contextual Information in Hypertext Retrieval
No ratings yet
Contextual Information in Hypertext Retrieval
9 pages
Anatomy of Google's Search Engine
No ratings yet
Anatomy of Google's Search Engine
24 pages
Eigenvector Insights of PageRank Algorithm
No ratings yet
Eigenvector Insights of PageRank Algorithm
13 pages
Analysis of Rank Sink Problem in PageRank Algorithm
No ratings yet
Analysis of Rank Sink Problem in PageRank Algorithm
6 pages
Eigenvector Insights of PageRank Algorithm
No ratings yet
Eigenvector Insights of PageRank Algorithm
13 pages
Web Search Engine Optimization Overview
No ratings yet
Web Search Engine Optimization Overview
32 pages
Link Analysis: PageRank and HITS Explained
No ratings yet
Link Analysis: PageRank and HITS Explained
47 pages
PageRank Algorithm and Web Graph Analysis
No ratings yet
PageRank Algorithm and Web Graph Analysis
17 pages
The Linear Algebra Behind Google
No ratings yet
The Linear Algebra Behind Google
13 pages
PageRank Algorithm and Linear Algebra
No ratings yet
PageRank Algorithm and Linear Algebra
11 pages
Network Analysis of Wikipedia Entries
No ratings yet
Network Analysis of Wikipedia Entries
12 pages
Search Engine Mechanics Explained
No ratings yet
Search Engine Mechanics Explained
13 pages
Link Analysis in Web Mining Overview
No ratings yet
Link Analysis in Web Mining Overview
29 pages
Understanding the PageRank Algorithm
No ratings yet
Understanding the PageRank Algorithm
3 pages
Anatomy of a Large-Scale Search Engine
No ratings yet
Anatomy of a Large-Scale Search Engine
27 pages
Equilibrium Sets in Directed Networks
No ratings yet
Equilibrium Sets in Directed Networks
33 pages
Search Engine Mechanics Explained
No ratings yet
Search Engine Mechanics Explained
10 pages
Method For Node Ranking in A Linked Database
No ratings yet
Method For Node Ranking in A Linked Database
12 pages
SEO Strategies for Improved Rankings
No ratings yet
SEO Strategies for Improved Rankings
7 pages
Web Search and Impact Factor Analysis
No ratings yet
Web Search and Impact Factor Analysis
52 pages
Web Search Engine Architecture Overview
No ratings yet
Web Search Engine Architecture Overview
29 pages
Understanding Search Engine Basics
No ratings yet
Understanding Search Engine Basics
47 pages
Web Search Fundamentals and Ads Analysis
No ratings yet
Web Search Fundamentals and Ads Analysis
52 pages
Deeper Inside Pagerank: Amy N. Langville and Carl D. Meyer
No ratings yet
Deeper Inside Pagerank: Amy N. Langville and Carl D. Meyer
46 pages
Understanding Google's PageRank Algorithm
No ratings yet
Understanding Google's PageRank Algorithm
15 pages
Exploring PageRank Model Enhancements
No ratings yet
Exploring PageRank Model Enhancements
33 pages
Understanding the PageRank Algorithm
No ratings yet
Understanding the PageRank Algorithm
9 pages
Lec 8
No ratings yet
Lec 8
26 pages
Unit 2 SEO
No ratings yet
Unit 2 SEO
52 pages
PageRank and HITS Algorithm Overview
No ratings yet
PageRank and HITS Algorithm Overview
14 pages
Web and Text Mining Overview
No ratings yet
Web and Text Mining Overview
36 pages
Understanding RESTful Web Services
No ratings yet
Understanding RESTful Web Services
35 pages
Understanding Web Services Concepts
No ratings yet
Understanding Web Services Concepts
12 pages
Collaborative Filtering in Recommender Systems
No ratings yet
Collaborative Filtering in Recommender Systems
14 pages
Understanding Homophily in Networks
No ratings yet
Understanding Homophily in Networks
15 pages
Understanding Homophily in Social Networks
No ratings yet
Understanding Homophily in Social Networks
39 pages
Social Network Analysis Basics
No ratings yet
Social Network Analysis Basics
106 pages
SEO Worksheets and Cheat Sheets Pack
No ratings yet
SEO Worksheets and Cheat Sheets Pack
9 pages
SEO Services Guide for New York
No ratings yet
SEO Services Guide for New York
4 pages
Internet Marketing Knowledge Assessment
No ratings yet
Internet Marketing Knowledge Assessment
56 pages
外贸开发工具与客户信息分析
No ratings yet
外贸开发工具与客户信息分析
48 pages
Site Assessment Worksheet
100% (1)
Site Assessment Worksheet
5 pages
SEO Health Checkup Overview
No ratings yet
SEO Health Checkup Overview
13 pages
SEO Basics for Beginners 2023
No ratings yet
SEO Basics for Beginners 2023
24 pages
Seo Sem Vi
No ratings yet
Seo Sem Vi
2 pages
Scribd Manuals Search Resource
No ratings yet
Scribd Manuals Search Resource
82 pages
SEMrush Domain Overview for Science Inc.
No ratings yet
SEMrush Domain Overview for Science Inc.
8 pages
Report Lintasarta Apr 19-258a7e5774 PDF
No ratings yet
Report Lintasarta Apr 19-258a7e5774 PDF
65 pages
Understanding How Search Engines Work
No ratings yet
Understanding How Search Engines Work
3 pages
SEO Services for Adelaide Businesses
No ratings yet
SEO Services for Adelaide Businesses
1 page
SEO Strategies for Enhanced Visibility
50% (2)
SEO Strategies for Enhanced Visibility
27 pages
Ad:Tech London Rebranding Overview
No ratings yet
Ad:Tech London Rebranding Overview
2 pages
Social Media Profile Backlink List
No ratings yet
Social Media Profile Backlink List
6 pages
Blogger SEO & AdSense Guide for Beginners
No ratings yet
Blogger SEO & AdSense Guide for Beginners
6 pages
Manuals Search Engine Overview
No ratings yet
Manuals Search Engine Overview
107 pages
G-Suite for Education Training Checklist
No ratings yet
G-Suite for Education Training Checklist
3 pages
Contrato de Marketing Digital Goat
No ratings yet
Contrato de Marketing Digital Goat
20 pages
Essential Local SEO Strategies for Success
No ratings yet
Essential Local SEO Strategies for Success
3 pages
SEO Setup Purchase Order Details
No ratings yet
SEO Setup Purchase Order Details
2 pages
Manuals Search Engine Overview
No ratings yet
Manuals Search Engine Overview
172 pages
Local Keyword Rank Tracking Strategies
No ratings yet
Local Keyword Rank Tracking Strategies
8 pages
IndiaMART Promotional Package Overview
No ratings yet
IndiaMART Promotional Package Overview
5 pages
SEO and Digital Marketing Resume
No ratings yet
SEO and Digital Marketing Resume
3 pages
Manual de Taller Toyota Hiace 5L e PDF
100% (8)
Manual de Taller Toyota Hiace 5L e PDF
79 pages
Cdmp-10 m4 SEO Study Notes
No ratings yet
Cdmp-10 m4 SEO Study Notes
46 pages
XML Sitemap for Search Engines
No ratings yet
XML Sitemap for Search Engines
809 pages

SEO: The PAGE RANK Algorithm: Presidency University, Bengaluru School of Engineering

Uploaded by

SEO: The PAGE RANK Algorithm: Presidency University, Bengaluru School of Engineering

Uploaded by

PRESIDENCY UNIVERSITY, BENGALURU

 Assumption 1: A hyperlink is a quality signal.

 Assumption 1: A hyperlink is a quality signal.

 Assumption 1: A hyperlink is a quality signal.

 Assumption 1: A hyperlink is a quality signal.

 Assumption 1: A hyperlink is a quality signal.

 Assumption 1: A hyperlink is a quality signal.

 Searching on [text of d2] + [anchor text → d2] is often

 Searching on [text of d2] + [anchor text → d2] is often

 Searching on [text of d2] + [anchor text → d2] is often

 Searching on [text of d2] + [anchor text → d2] is often

 Searching on [text of d2] + [anchor text → d2] is often

 Searching on [text of d2] + [anchor text → d2] is often

 Searching on [text of d2] + [anchor text → d2] is often

 Searching on [text of d2] + [anchor text → d2] is often

 Searching on [text of d2] + [anchor text → d2] is often

 Thus: Anchor text is often a better description of a page’s

 Thus: Anchor text is often a better description of a page’s

 Assumption 1: A link on the web is a quality signal – the

 Assumption 1: A link on the web is a quality signal – the

 Assumption 1: A link on the web is a quality signal – the

 Assumption 1: A link on the web is a quality signal – the

(Source: [Link] WhitePaper_2006_SearchEngineUserBehavior.pdf) 38

PR(A) = (1-d) + d (PR(T1)/C(T1) + … + PR(Tn)/C(Tn))

• The number of ‘average clicks’ between two pages is defined as distance.

Load Parallelism Type Memory Used

The following are the constraints of HITS algorithm :

You might also like