๋ณธ๋ฌธ ๋ฐ”๋กœ๊ฐ€๊ธฐ

๐Ÿบ ๋งค์ผ ์•Œ๊ณ ๋ฆฌ์ฆ˜

Algorithm week 1: Union-Find

Algorithms, Part 1week1 Lecture note: 

implemented codes in python3.



Steps to developing a usable algorithm.

๔€‰พ Model the problem. ( try to understand the main elements of the problem that needs to be solved. )

ํ•ด๊ฒฐํ•ด์•ผํ•˜๋Š” ์š”์†Œ๋ฅผ ์ดํ•ดํ•˜๊ณ  ๋ฌธ์ œ๋ฅผ ๋ชจ๋ธ๋งํ•œ๋‹ค.

๔€‰พ Find an algorithm to solve it.

๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•  ์ˆ˜ ์žˆ๋Š” ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์•Œ์•„๋‚ธ๋‹ค.

๔€‰พ Fast enough? Fits in memory?

์ถฉ๋ถ„ํžˆ ๋น ๋ฅธ์ง€, ๋ฉ”๋ชจ๋ฆฌ ์‚ฌ์šฉ๋Ÿ‰์€ ์ ๋‹นํ•œ์ง€ ๊ณ ๋ คํ•œ๋‹ค.

๔€‰พ If not, figure out why.

์•„๋‹ˆ๋ผ๋ฉด, ์™œ ๊ทธ๋Ÿฐ์ง€ ์•Œ์•„๋‚ธ๋‹ค.

๔€‰พ Find a way to address the problem.

๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•  ์ˆ˜ ์žˆ๋Š” ๋ฐฉ๋ฒ•์„ ์ฐพ์•„๋‚ธ๋‹ค.

๔€‰พ Iterate until above things are satisfied.

์œ„์˜ ์š”์†Œ๋“ค์ด ๋งŒ์กฑ๋ ๋•Œ๊นŒ์ง€ ๋ฐ˜๋ณต.




overview


Lecture1: 


Unionโˆ’Find. We illustrate our basic approach to developing and analyzing algorithms by considering the dynamic connectivity problem. We introduce theunionโˆ’find data type and consider several implementations (quick find, quick union, weighted quick union, and weighted quick union with path compression). Finally, we apply the union-find data type to the percolation problem from physical chemistry.




Dynamic connectivity

Dynamic connectivity is the algorithm for finding whether the nodes are connected or not.
๋™์ ์—ฐ๊ฒฐ์€ ๋…ธ๋“œ๊ฐ„์˜ ์—ฐ๊ฒฐ์„ ์—ฌ๋ถ€๋ฅผ ์•Œ์•„๋‚ด๊ธฐ์œ„ํ•œ ์•Œ๊ณ ๋ฆฌ์ฆ˜์ด๋‹ค.

Dynamic connectivity example



  • symmetric(์ˆœ์ฐจ): If q is connected to p, then p is connected to q. 
                           q๊ฐ€ p์™€ ์—ฐ๊ฒฐ๋˜์—ˆ๋‹ค๋ฉด, p๋„ q์™€ ์—ฐ๊ฒฐ๋œ๋‹ค.
  • transitive(์ „์ด): If q is connected to p and p is connected to r, then q is connected to r.
                          q ๊ฐ€ p์™€ ์—ฐ๊ฒฐ๋˜์–ด ์žˆ๊ณ  p๊ฐ€ r๊ณผ ์—ฐ๊ฒฐ๋˜์—ˆ๋‹ค๋ฉด, q๋„ r๊ณผ ์—ฐ๊ฒฐ๋œ๋‹ค.
  • reflexive(์žฌ๊ท€): p is connected to p.
                       ๋ชจ๋“  ๋…ธ๋“œ๋Š” ์ž๊ธฐ ์ž์‹ ๊ณผ ์—ฐ๊ฒฐ๋˜์–ด์žˆ๋‹ค.




Goal

Write a program get 2 integers from a user and return True if they are connected( else return False).

2๊ฐœ์˜ ์ •์ˆ˜๋ฅผ ์ž…๋ ฅ๋ฐ›๊ณ  ํ•ด๋‹น ๋…ธ๋“œ๋“ค์ด ์—ฐ๊ฒฐ๋˜์–ด์žˆ์œผ๋ฉด True, ์—ฐ๊ฒฐ๋˜์–ด์žˆ์ง€ ์•Š์œผ๋ฉด False๋ฅผ ๋ฐ˜ํ™˜ํ•˜๋Š” ํ”„๋กœ๊ทธ๋žจ์„ ์ž‘์„ฑํ•œ๋‹ค.











Quick find 

data structure: simply an integer array id[] of size N.

Quick find overview

class QuickFind():
def __init__(self, N):
# Set id of each object to itself(N array accesses)
self.id = list()
for i in range(N):
self.id += [i]

# Check whether p and q are in the same component(2 array accesses)
def connected (self,p,q):
return self.id[p] == self.id[q]

# Change all entries with id[p](at most 2N + 2 array accesses)
def union (self, p, q):
pid = self.id[p]
qid = self.id[q]
for j in range(len(self.id)):
if self.id[j]==pid: self.id[j] = qid

algorithm

initialize

union 

 find

 quick-find

 N

 N

 1


This is called eager algorithm. 


It is easy to implement, but It takes O(n^2).

* The union operation is too expensive. *

In particular, if you just have N union commands on N objects

They're either connected or not then that will take quadratic time in squared time. 

And quadratic time is much to slow. we can't accept quadratic time algorithms for large problems. 

The reason is they don't scale. As computers get faster and bigger, quadratic algorithms actually get slower. 

A very rough standard, say, for now, is that people have computers that can run billions of operations per second, and they have billions of entries in main memory. 

 

Quick Union

data structures: tree 

Quick union overview

class QuickUnion():
def __init__(self, N):
# Set id of each object to itself(N array accesses)
self.id = list()
for i in range(N):
self.id += [i]

# Check parent pointers until reach root (depth or i array accesses)
def root(self, n):
while n != self.id[n]:
n = self.id[n]
return n

# Check if p and q have a same root (depth of p and q array accesses)
def connected (self,p,q):
return self.root(p) == self.root(q)

# Change root of p to point to root q (depth of p and q array accesses)
def union (self, p, q):
i = self.root(p)
j = self.root(q)
self.id[i] = j


This is called a lazy approach


it is faster than quick find, but also slower.

The tree could be too tall. (Then find operation gonna be too expensive.)


(worst case)

algorithm

initialize

union 

 find

 quick-find

 N

 N

 1

 quick-union

 N

 N

 N



Improvement 1: Weighted quick union

Modify quick-union to avoid tall trees.

Keep track of the size of each tree (number of objects).

Balance by linking root of a smaller tree to the root of a larger tree.

Weighted quick union overview

class WeightedQuickUnion():
def __init__(self, N):
# Set id of each object to itself(N array accesses)
# extra array sz[i] to count number of objects in the tree rooted at i.
self.id = list()
for i in range(N):
self.id += [i]
self.sz = [1]*len(self.id)


# Check parent pointers until reach root (depth or i array accesses)
def root(self, n):
while n != self.id[n]:
n = self.id[n]
return n

# Check if p and q have a same root (depth of p and q array accesses)
def connected (self,p,q):
return self.root(p) == self.root(q)

# Link root of smaller tree to root of larger tree
def union (self, p, q):
i = self.root(p)
j = self.root(q)
if i == j: return
if self.sz[i] < self.sz[j]:
self.id[i] = j
self.sz[j] += self.sz[i]
else:
self.id[j] = i
self.sz[i] += self.sz[j]


algorithm

initialize

union 

 find

 quick-find

 N

 N

 1

 quick-union

 N

 N

 N

 weighted quick-union

 N

 log N

 log N


Improvement 2: Path compression

class PathCompression_QU():
def __init__(self, N):
# Set id of each object to itself(N array accesses)
# extra array sz[i] to count number of objects in the tree rooted at i.
self.id = list()
for i in range(N):
self.id += [i]
self.sz = [1]*len(self.id)


# Check parent pointers until reach root (depth or i array accesses)
def root(self, n):
while n != self.id[n]:
self.id[n] = self.id[self.id[n]]
n = self.id[n]
return n

# Check if p and q have a same root (depth of p and q array accesses)
def connected (self,p,q):
return self.root(p) == self.root(q)

# Link root of smaller tree to root of larger tree
def union (self, p, q):
i = self.root(p)
j = self.root(q)
if i == j: return
if self.sz[i] < self.sz[j]:
self.id[i] = j
self.sz[j] += self.sz[i]
else:
self.id[j] = i
self.sz[i] += self.sz[j]

algorithm

 worst-case time

 quick-find

 M N

 quick-union

 M N

 weighted quick-union

 N + M log N

 quick-union + path compression

 N + M log N

 weighted quick-union + path compression

 N + M lg* N