Asksia AI LOGO

Sia

Question
Computer Science
Posted 9 months ago
Exercise 
3
.
3
.
1
 : Verify the theorem from Section 
3
.
3
.
3
,
 which relates the Jac
-
 card similarity to the probability of minhashing to equal values, for the partic
-
 ular case of Fig. 
3
.
2
.
 
(
a
)
 Compute the Jaccard similarity of each of the pairs of columns in Fig. 
3
.
2
.
 
!
 
(
b
)
 Compute, for each pair of columns of that figure, the fraction of the 
1
2
0
 permutations of the rows that make the two columns hash to the same value. Exercise 
3
.
3
.
1
: Verify the theorem from Section 
3
.
3
.
3
,
 which relates the Jac
-
 card similarity to the probability of minhashing to equal values, for the partic
-
 ular case of Fig. 
3
.
2
.
 
(
a
)
 Compute the Jaccard similarity of each of the pairs of columns in Fig. 
3
.
2
.
 
!
 
(
b
)
 Compute, for each pair of columns of that figure, the fraction of the 
1
2
0
 permutations of the rows that make the two columns hash to the same value. Si S
2
 S
4
 Element b e a d O O 
1
 
1
 O S
3
 
1
 
1
 o 
1
 O O o 
1
 
1
 
1
 
с
 
1
 
3
.
3
: A permutation of the rows of Fig. 
3
.
2
For the above question what is meant by true permutation
Sign in to unlock the answer
Answer from Sia
Posted 9 months ago
A true permutation in the context of the given exercise refers to a rearrangement of the rows of the matrix in Fig. 3.2.
To verify the theorem from Section 3.3.3, which relates the Jaccard similarity to the probability of minhashing to equal values, follow these steps:
(a) Compute the Jaccard similarity of each of the pairs of columns in Fig. 3.2:
The Jaccard similarity between two sets A A and B B is defined as:
J(A,B)=ABABJ(A, B) = \frac{|A \cap B|}{|A \cup B|}
For each pair of columns, identify the sets of rows where the value is 1, then compute the intersection and union of these sets to find the Jaccard similarity.
(b) Compute, for each pair of columns of that figure, the fraction of the 120 permutations of the rows that make the two columns hash to the same value:
The probability that two columns hash to the same value under a random permutation is equal to their Jaccard similarity.
For each pair of columns, count the number of permutations that result in the same minhash value and divide by the total number of permutations (120).
This verifies the theorem that the probability of minhashing to equal values is equal to the Jaccard similarity.

Not the question you are looking for? Ask here!

Enter question by text

Enter question by image

Unlock Smarter Learning with AskSia Super!

Join Super, our all-in-one AI solution that can greatly improve your learning efficiency.

30% higher accuracy than GPT-4o
Entire learning journey support
The most student-friendly features
Study Other Question