| block.ids.from.blocking | Returns the block ids associated with a blocking method. |
| block_setup_v2 | Function that divides all records into bins using locality-sensitive hashing, specifically TLSH (based on a community detection technique) |
| compare_buckets | Function that creates a similarity graph and divides it into communities (or blocks) for entity resolution |
| confusion.from.blocking | Performs evaluations (recall) for blocking. |
| eval.blocksetup | Function to evaluate the blocking step |
| extract_pairs_from_band | Function that extracts pairs of records from a band in the signature matrix M |
| hash_signature | Function to take a signature matrix M composed of b bands and r rows and return a bucket for each band for each record |
| minhash_v2 | Function to create a matrix of minhashed signatures |
| my_hash | Function that applies a hash function to each column of the band from the signature matrix |
| primest | Function to generate all primes greater than an integer n1 (lower limit) and less than another integer n2 (upper limit) |
| reduction.ratio | Returns the reduction ratio associated with a blocking method |
| reduction.ratio.from.blocking | Returns the reduction ratio associated with a blocking method |
| rhash_funcs | Function to generate a vector of random hash functions (or optionally one vector-valued function) |
| shingled_record_to_index_vec | Function to convert a shingled record to a vector of indices, indicating which index each shingle corresponds to in the record |
| shingles | Function to shingle a string into its k-shingles (tokens or grams) |
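
The entries above describe a pipeline of shingling, minhashing, band hashing, and evaluation by reduction ratio. The following is a minimal Python sketch of that general idea, not the package's R implementation. All function names (`shingle`, `minhash_signature`, `lsh_blocks`, `reduction_ratio`) are hypothetical. Two simplifications are assumed: records that share a bucket in any band are merged by connected components rather than by community detection, and the reduction ratio uses the standard definition of one minus the fraction of record pairs that remain within blocks.

```python
import hashlib
from collections import defaultdict
from math import comb


def shingle(text, k=2):
    """Return the set of character k-grams of a string."""
    if len(text) < k:
        return {text}
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def _hash(seed, value):
    """Deterministic 64-bit hash of `value` under an integer seed."""
    digest = hashlib.sha256(f"{seed}:{value}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(shingle_set, n_hashes):
    """One column of the signature matrix: the minimum hash per seed."""
    return [min(_hash(seed, s) for s in shingle_set) for seed in range(n_hashes)]


def lsh_blocks(records, k=2, bands=4, rows=2):
    """Label records so that any two sharing a bucket in some band are merged.

    Assumption: merging uses connected components (union-find), a simpler
    stand-in for the community detection step named in the table.
    """
    sigs = [minhash_signature(shingle(r, k), bands * rows) for r in records]
    parent = list(range(len(records)))  # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for b in range(bands):
        buckets = defaultdict(list)
        for idx, sig in enumerate(sigs):
            # The bucket key is the slice of the signature for this band.
            buckets[tuple(sig[b * rows:(b + 1) * rows])].append(idx)
        for members in buckets.values():
            for other in members[1:]:
                parent[find(other)] = find(members[0])
    return [find(i) for i in range(len(records))]


def reduction_ratio(block_labels):
    """Return 1 - (within-block pairs) / (all pairs)."""
    total = comb(len(block_labels), 2)
    if total == 0:
        raise ValueError("need at least two records")
    counts = defaultdict(int)
    for label in block_labels:
        counts[label] += 1
    within = sum(comb(c, 2) for c in counts.values())
    return 1 - within / total
```

For example, `reduction_ratio(lsh_blocks(records))` gives the share of pairwise comparisons that blocking avoids for a list of strings `records`. More bands with fewer rows per band make records more likely to share a bucket, which produces larger blocks and a lower reduction ratio.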